Foreword


This volume collects the refereed contributions based on the presentations made at
the sixth workshop on the theme of advanced mathematical and computational tools
in metrology, held at the Istituto di Metrologia “G. Colonnetti” (IMGC), Torino,
Italy, in September 2003. The aims of the European Project now supporting the
activities in this field were:


To present and promote reliable and effective mathematical and computational 
tools in metrology. 


To understand better the modelling, statistical and computational requirements 
in metrology. 


To provide a forum for metrologists, mathematicians and software engineers 
that will encourage a more effective synthesis of skills, capabilities and 
resources. 


To promote collaboration in the context of EU Programmes, EUROMET and 
EA Projects, MRA requirements. 


To support young researchers in metrology and related fields. 


To address industrial requirements. 


The themes in this volume reflect the importance of mathematical, statistical and
numerical tools and techniques in metrology, and also take up the challenge
promoted by the Metre Convention: to achieve mutual recognition of
measurement standards.


Torino, February 2004 The Editors 
A new kind of artefact, based on a modification of the hexapod machine’s well-known
structure, has been introduced by Antunes, S. D. et al. in [1], in order to determine the
global errors of coordinate measuring machines. Here we present results from the
validation of the technique, using a self-calibrated method and modeling the reference
value for calibration based on laser trilateration.


1. Introduction 


Uncertainty, as defined in the ISO Guide to the Expression of Uncertainty in 
Measurement (GUM) [4] and in the International Vocabulary of Basic and 
General Terms in Metrology (VIM), is a parameter, associated with the result of 
a measurement, which characterizes the dispersion of the values that could 
reasonably be attributed to the measurand. 

Calibration and numerical error correction of coordinate measuring 
machines (CMMs) require an error behavior mathematical model and methods 
to assess the errors. Traceability of CMMs must be based on traceability chains 
and commonly accepted methods for uncertainty evaluation [3]. 


2. Calibration Artefacts for CMMs 


Different artefacts (geometrical gauges) can be used to perform the calibration
of large CMMs (with one or more axes longer than 2 m). Among the
calibrated artefacts, we can mention the 2D lightweight ball plates (with carbon
fiber rod structure and ceramic spheres), the 2D disassemblable ball plates (with
L-shape, for example, made of carbon fiber tubes) or the 1D disassemblable
multi-ball bars (of carbon fiber pipes), as used by [2]. Alternatively, the
calibration can be made with an arrangement of laser interferometers and level
meters or straight edges to assess the full error analysis, but these methods
require expensive tools, are time consuming and need especially skilled
personnel. In addition, it is possible to use uncalibrated artefacts, but it is
then necessary to have absolute length standards to determine scale factors. In
this last case, the artefact must be placed in substantially more positions
than when using calibrated artefacts.

For small and medium size coordinate measuring machines there are several
artefact-based methods for full error analysis, which is a prerequisite for
establishing traceability. As in the case of large CMMs, there are calibrated and
uncalibrated artefacts for this purpose. The most common calibrated artefacts
are gauge blocks, step gauges, ball bars, ball plates and hole plates ᵃ.
In addition, laser interferometers are also used to perform the
measurement of linear accuracy, straightness and angularity checks.
Additionally, there are uncalibrated devices, like the artefact presented in [1],
which is the object of the main study described in this text.

Artefacts are effective for local and relative calibrations of CMMs. We 
have compared four different types of reference artefacts [5]: 

• the single sphere (see figure 1);

• the step gauge block (see figure 2);

• the ball plate (see figure 3);

• the hole bar (see figure 4).

The single sphere is used to calibrate the probe of the CMM and a good 
metrological sphere of known and calibrated diameter is needed. All the other 
listed artefacts are used to calibrate the CMM, in order to find systematic errors 
on the CMM’s measurement volume. 





Figure 1. The sphere. Figure 2. The step gauge block. 





ᵃ Like the artefact presented in http://www. 1gg.com/html/calibcemm.htmi.
Figure 3. The ball plate. Figure 4. The hole bar.





Figure 5. Trying to calibrate CMM’s volume measurement with different supports for the ball plate. 


The step gauge block enables a linear calibration in a direction 
perpendicular to the reference surface of the gauges. 

The ball plate allows the calibration in the plane of the center of the balls 
constituting the ball plate (the reference points are the center of the balls in its 
plane). 

The hole bar allows a linear calibration in the plane of one surface of the 
holes (the reference points are the center of the holes in that plane). But the hole 
bar also enables the measurement of local errors measuring the position of the 
axis of reference holes in the bar. 

We tried to solve the problem of volume calibration with a ball plate,
developing specific supports for using the ball plate (see figure 5), but we
always had the problem of how to correlate the measurements with the ball plate
calibration.


Those artefacts are effective for local and relative calibrations, but are not 
appropriate for measuring all calibrating parameters of a CMM. In order to 
solve those problems, a new kind of artefact, based on a modification of the 
known structure of the hexapod machine, has been introduced in [1]. The 
proposed artefact was similar to a tripod system constituted by three links 
connecting three fixed spheres to a mobile sphere, with the mobile sphere acting 
as the reference point for CMM calibration (see figure 6, from [1]). 


Hexapod → Artefact





Figure 6. The calibrating artefact, hexapod based. 


This kind of artefact needs to be validated, for the purpose of verifying its 
adequacy. Its validation and the measurement techniques use data fusion applied 
to non-linear and non-observable systems. 

The global calibration of a CMM is done by comparing the calculated position of
the moving sphere center with the corresponding position measured by the CMM in
the measurement volume. Only length variations are used to obtain the
calculated position, and these length variations are measured by three miniature
laser interferometers installed inside the telescopic links (see figures 7 and 8).
The artefact uses a self-calibrated method and its modelling is based on laser
trilateration.

The problem of global calibration can be divided into two steps:

1st step - prediction (geometry identification): knowledge about the system
geometry (localization of the fixed points) and tracking (following the mobile
point) are used to obtain the geometry identification. This is done by finding
the initial lengths of the links $l_i$, the coordinates of the mobile point
$x_j$, $y_j$, and the coordinates of the fixed points $x_{0i}$, $y_{0i}$ that
minimize:

$$\sum_{i=1}^{m}\sum_{j=1}^{n}\left[(l_i + dl_{ij})^2 - \left((x_{0i} - x_j)^2 + (y_{0i} - y_j)^2\right)\right]^2 \qquad (1)$$

with $dl_{ij}$ representing the laser interferometer measurements;

2nd step - update (system identification): after tracking and identifying the
geometry, the problem is the identification of the mobile point position, in real
time, using 3D laser trilateration.


3. Modeling Artefact for Global Calibration of CMMs 


Let us consider three fixed spheres P1, P2 and P3, each one connected to a
mobile sphere Q by a telescopic link (see figure 8). If we know the lengths of
the lines connecting the centers of the fixed spheres to the mobile sphere,
perturbed by known noise and by random noise:

$$L_i = L_{0i} + p_{0i} + p_{ui}, \qquad (2)$$

then, from trilateration, we can find the coordinates of the mobile sphere
Q(X, Y, Z) and also the uncertainty of each coordinate $(u_X, u_Y, u_Z)$.







Figure 7. Reference coordinates Oxyz. 


Considering the coordinates of the fixed sphere centers P1(X1, Y1, Z1),
P2(X2, Y2, Z2), P3(X3, Y3, Z3) and assuming Z1 = Z2 = Z3 = 0, we obtain:


ᵇ The measurements of the link increments, $dl_i$, are perturbed at least by
ambient humidity, ambient temperature and atmospheric pressure. In addition, the
measured $dl_i$ are estimated with an uncertainty composed of the uncertainties
of the measured perturbations, whose probability density functions are of unknown
type (Gaussian or non-Gaussian) and which certainly constitute processes that may
be stationary or non-stationary.


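$$\begin{aligned}
2(X_2 - X_1)\,X + 2(Y_2 - Y_1)\,Y &= L_1^2 - L_2^2 + X_2^2 + Y_2^2 - X_1^2 - Y_1^2 \\
2(X_3 - X_1)\,X + 2(Y_3 - Y_1)\,Y &= L_1^2 - L_3^2 + X_3^2 + Y_3^2 - X_1^2 - Y_1^2 \\
Z &= \sqrt{L_1^2 - (X - X_1)^2 - (Y - Y_1)^2}
\end{aligned} \qquad (3)$$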


[Figure 8 labels: Q, mobile sphere (reference for calibration); retroreflector;
telescopic link; interferometer; fixed sphere]





Figure 8. Telescopic links. 


Or, in particular, if we consider the specific coordinates from figure 7,
P1(0, 0, 0), P2(X2, 0, 0) and P3(X3, Y3, 0), we get:

$$X = \frac{L_1^2 - L_2^2 + X_2^2}{2X_2}, \qquad
Y = \frac{L_1^2 - L_3^2 + X_3^2 + Y_3^2 - 2XX_3}{2Y_3} \qquad (4)$$

and

$$Z = \sqrt{L_1^2 - X^2 - Y^2}$$
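
As an illustration of the trilateration relations for the specific coordinates of figure 7, the following is a minimal Python sketch (the paper's own routines were written in Matlab and are not reproduced here). The function name `trilaterate`, the choice of the Z >= 0 solution and the numerical geometry are assumptions made only for this example.

```python
import math


def trilaterate(L1, L2, L3, X2, X3, Y3):
    """Mobile sphere centre (X, Y, Z) from the three link lengths.

    Assumes the fixed spheres sit at P1(0,0,0), P2(X2,0,0), P3(X3,Y3,0)
    and returns the solution with Z >= 0 (the mirror solution has -Z).
    """
    X = (L1**2 - L2**2 + X2**2) / (2.0 * X2)
    Y = (L1**2 - L3**2 + X3**2 + Y3**2 - 2.0 * X * X3) / (2.0 * Y3)
    z_squared = L1**2 - X**2 - Y**2
    if z_squared < 0.0:
        raise ValueError("link lengths are inconsistent with the geometry")
    return X, Y, math.sqrt(z_squared)


# Hypothetical geometry (relative units), used only to check the round trip.
P1, P2, P3 = (0.0, 0.0, 0.0), (6.0, 0.0, 0.0), (3.0, 5.0, 0.0)
Q_true = (2.0, 3.0, 4.0)
L1, L2, L3 = (math.dist(Q_true, P) for P in (P1, P2, P3))

X, Y, Z = trilaterate(L1, L2, L3, X2=P2[0], X3=P3[0], Y3=P3[1])
```

Because Z enters only through a square root, the sketch cannot distinguish the two mirror positions on either side of the plane of the fixed spheres; the sign has to come from knowledge of the set-up.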


We use those equations in order to find reference values for the artefact and
the first set of equations to find, following the GUM, the uncertainty for the
reference coordinates X, Y and Z, knowing the uncertainties for the centers of the
fixed spheres X1, Y1, Z1, X2, Y2, Z2, X3, Y3 and Z3, and also for the lengths of the
links L1, L2 and L3.

Following the GUM, the uncertainty $u_Y$ of a dependent quantity
$Y = Y(X_1, X_2, X_3, \ldots, X_n)$, a function of n uncorrelated measurements
$X_i$ with uncertainties $u_{X_i}$, is defined by:

$$u_Y^2 = \sum_{i=1}^{n}\left(\frac{\partial Y}{\partial X_i}\right)^2 u_{X_i}^2 \qquad (5)$$

In order to bypass the derivatives $\partial Y/\partial X_i$, we use the
difference resulting from increments on $X_i$ equal to the uncertainty
$u_{X_i}$. Consequently, $u_Y$ is given by:

$$u_Y = \sqrt{\sum_{k=1}^{n}\left[Y(X_1, \ldots, X_k + u_{X_k}, \ldots, X_n) - Y(X_1, \ldots, X_k, \ldots, X_n)\right]^2} \qquad (6)$$

Two problems exist, relating to the length of each link $L_i$ and to the
coordinates of the fixed spheres P1(X1, Y1, Z1), P2(X2, Y2, Z2) and P3(X3, Y3, Z3).
With the laser interferometers we measure only length variations. We need
to estimate the real lengths of the links, and also the coordinates of the fixed
spheres. We assume that X1 = Y1 = Z1 = Y2 = Z2 = Z3 = 0, using the reference
coordinates from figure 7. All the other values, and also the uncertainties for all
values, are calculated by self-calibration.
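
The replacement of the partial derivatives by finite differences, with each input incremented by its own uncertainty, can be sketched as follows. This is only an illustrative Python version under the assumption of a forward difference (one evaluation per input); the function name `propagate_uncertainty` and the linear test function are hypothetical.

```python
import math


def propagate_uncertainty(func, values, uncertainties):
    """Root-sum-square of the changes in Y = func(*values) obtained by
    incrementing each input X_k, one at a time, by its uncertainty u_Xk.

    Assumes uncorrelated inputs, as in the GUM law of propagation.
    """
    y_nominal = func(*values)
    total = 0.0
    for k, u_k in enumerate(uncertainties):
        shifted = list(values)
        shifted[k] += u_k          # increment only the k-th input
        total += (func(*shifted) - y_nominal) ** 2
    return math.sqrt(total)


# Hypothetical check on a linear model Y = 2*X1 + 3*X2, for which the
# finite-difference result coincides with the derivative-based one.
u_Y = propagate_uncertainty(lambda x1, x2: 2.0 * x1 + 3.0 * x2,
                            values=[1.0, 1.0],
                            uncertainties=[0.1, 0.2])
```

For a linear model the two formulations agree exactly; for the non-linear trilateration relations the finite-difference value also picks up higher-order terms, so it should be read as an approximation of the GUM expression.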


4. Self-calibration of the Artefact 


Self-calibration (system identification) is done by moving the mobile sphere
over the plane of the fixed spheres and minimizing the cost function c:

$$c = \sum_{k=0}^{m}\left(a_{1,k}^2 + a_{2,k}^2 + a_{3,k}^2\right) \qquad (7)$$

where $a_{1,k}$, $a_{2,k}$, $a_{3,k}$ are given by the set of 3(m+1) equations,
for k = 0, ..., m:

$$\begin{aligned}
a_{1,k} &= (X_{k,est} - X_{1,est})^2 + (Y_{k,est} - Y_{1,est})^2 + (Z_{k,est} - Z_{1,est})^2 - (L_{1,est} + d_{1,k})^2 \\
a_{2,k} &= (X_{k,est} - X_{2,est})^2 + (Y_{k,est} - Y_{2,est})^2 + (Z_{k,est} - Z_{2,est})^2 - (L_{2,est} + d_{2,k})^2 \\
a_{3,k} &= (X_{k,est} - X_{3,est})^2 + (Y_{k,est} - Y_{3,est})^2 + (Z_{k,est} - Z_{3,est})^2 - (L_{3,est} + d_{3,k})^2
\end{aligned} \qquad (8)$$

with $X_{k,est}$, $Y_{k,est}$, $Z_{k,est}$ the estimated coordinates of the
mobile sphere in the kth position, $L_{1,est}$, $L_{2,est}$, $L_{3,est}$ the
estimated lengths of the three links at the initial position (k = 0), and
$d_{1,k}$, $d_{2,k}$, $d_{3,k}$ the measured length variations of the three
links.

For m+1 positions of the mobile sphere, we get 3(m+1) equations with an
equal number of measurements and with 3m+15 unknowns (3m+3 coordinates
for the mobile sphere positions, 9 coordinates for the fixed sphere positions and
3 initial lengths for the links).
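
To make the structure of the cost function concrete, here is a minimal Python sketch that evaluates c for a given set of estimates. It only evaluates the cost; the minimisation itself (done by the authors with the Matlab optimization toolbox) is not reproduced. The function name, the argument layout and the noise-free test geometry are assumptions for illustration.

```python
import math


def self_calibration_cost(mobile_est, fixed_est, lengths_est, d):
    """Sum over positions k and links i of a_{i,k}**2, where a_{i,k} is the
    squared distance between mobile position k and fixed sphere i minus the
    squared link length (L_i_est + d[k][i]).

    mobile_est  : list of (X, Y, Z) for k = 0..m
    fixed_est   : list of three (X, Y, Z) fixed sphere centres
    lengths_est : three initial link lengths (at k = 0)
    d           : d[k][i], measured length variation of link i at position k
    """
    cost = 0.0
    for (xk, yk, zk), d_k in zip(mobile_est, d):
        for (xi, yi, zi), l_i, d_ik in zip(fixed_est, lengths_est, d_k):
            a_ik = ((xk - xi) ** 2 + (yk - yi) ** 2 + (zk - zi) ** 2
                    - (l_i + d_ik) ** 2)
            cost += a_ik ** 2
    return cost


# Hypothetical noise-free data: the true parameters must give c close to 0.
fixed = [(0.0, 0.0, 0.0), (6.0, 0.0, 0.0), (3.0, 5.0, 0.0)]
mobile = [(2.0, 3.0, 0.0), (4.0, 1.0, 0.0), (3.0, 2.0, 0.0)]
lengths = [math.dist(mobile[0], p) for p in fixed]
d = [[math.dist(q, p) - l for p, l in zip(fixed, lengths)] for q in mobile]

exact_cost = self_calibration_cost(mobile, fixed, lengths, d)
biased_cost = self_calibration_cost(mobile, fixed,
                                    [l + 0.1 for l in lengths], d)
```

With consistent data the cost vanishes up to rounding, while a biased set of initial lengths gives a strictly larger value, which is the property the optimiser exploits.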

The problem has many solutions, and the fixed sphere’s location enables a 
quick evaluation of the validity of the solution. In figure 9 we present four 
possible solutions for the same inputs, resulting from the evident axis 
symmetrical solutions. The optimal solution search is done using the 
optimization toolbox of Matlab: starting with a constrained method in order to 
fix the solution in the adequate region and concluding with an unconstrained 
method to obtain a quicker and better solution. 
Figure 9. Those figures present possible evident solutions for self-calibration.


The cost function value is used to obtain the estimated uncertainty for the 
estimated parameters. 


5. Simulating Calibration Artefact 


In order to verify the adequacy of the developed device, some numerical
simulations were made, using the software Matlab (version 6.5 R13). The
programs are divided into two major parts: in the first, the purpose is to
perform the self-calibration of the system; in the second, the adjusted
parameters obtained in the first part are used, with trilateration, to obtain
estimated coordinates and the corresponding uncertainty for some reference
points.

The program needs initial parameters, namely the minimum and the
maximum link length, the noise amplitude of the link length variation, the number
of constrained steps (necessary to constrain the area where the solutions could be
found, using the optimization method) and the forbidden area radius (where no
points can be placed, corresponding to a physical limitation of the artefact).
Table 1 summarizes the parameters (in relative units) introduced in the
presented examples.


Table 1. Parameters used for simulations.

links length noise amplitude | 0 | 0.01
forbidden area radius        | 0.6


After the introduction of the initial parameters, the user chooses graphically,
using the mouse pointer, the positions of the three fixed spheres and
the various moving sphere positions. In figure 10 these positions are presented
for the 2nd simulation, similar to the other two simulations performed (note that in
the measurement space there are some regions of shadow, corresponding to
zones that are unreachable by the moving sphere). As the user introduces a new
position, the parameters are recalculated using all the information available.


Figure 10. Example of fixed spheres position and mobile sphere trajectory during self-calibration.

After introducing all moving sphere positions, the first part of the program
is finished and the results of self-calibration are presented, in graphical and in
numerical form. In table 2 we present the final numerical results of
self-calibration for the 2nd simulation, corresponding to a links length noise
amplitude equal to 0.01.

The results, in a graphical form, are presented in figure 11, which shows the 
real positions versus estimated positions of the fixed points, the variations of the 
link lengths, the evolution of the cost function that is minimized during the 
program and the evolution of the estimated values for the initial link lengths. 

For the 1st and the 3rd simulations the graphical results of self-calibration are
shown in figures 12 and 13.


Table 2. Results of self-calibration with links length noise amplitude 0.01.

parameter             | reference value  | estimated value  | deviation
initial links lengths | 4.14524605719157 | 4.14762057356121 |  0.00237451636964
                      | 5.77216978117969 | 5.76981912447491 | -0.00235065670478
                      | 4.80120164387303 | 4.81250457805668 |  0.01130293418365
x of 2nd fixed point  | 7.64741472782958 | 7.64545798330029 | -0.00195674452929
x of 3rd fixed point  | 5.87877399052966 | 5.89051558442518 |  0.01174159389553
y of 3rd fixed point  | 6.74274716485670 | 6.75540982406257 |  0.01266265920588
1st moving point x    | 2.76878359671112 | 2.77157430333704 |  0.00279070662592
1st moving point y    | 3.08494769311341 | 3.08807624576027 |  0.00312855264686
2nd moving point x    | 3.35102638810543 | 3.36071345716409 |  0.00968706905866
2nd moving point y    | 4.19778270569769 | 4.19928162592284 |  0.00149892022515




After the self-calibration, it is possible to introduce in the program some
values of 3D reference point coordinates and compare them with the
corresponding estimated coordinates, obtained by trilateration from the final
parameters (given by the self-calibration process). In addition, those results have
a corresponding estimated uncertainty, which is a function of the uncertainty of
the links’ lengths. Table 3 summarizes some results for 4 reference points
obtained in the three simulations performed.
Figure 11. Self-calibration with link’s length noise amplitude = 0.01.
Figure 12. Self-calibration with link’s length noise amplitude = 0.0.
Figure 13. Self-calibration with link’s length noise amplitude = 0.3.


Table 3. Trilateration results.

Estimated coordinates (one column per links length noise amplitude)

  4.000187 ± 0.000002 |   4.00 ± 0.02
  4.000031 ± 0.000002 |   4.00 ± 0.02
  4.000009 ± 0.000003 |   3.99 ± 0.01
  2.000256 ± 0.000002 |   2.01 ± 0.02
  5.000069 ± 0.000002 |   5.00 ± 0.02
  7.000042 ± 0.000002 |   7.01 ± 0.02
 25.000606 ± 0.000007 |  25.05 ± 0.08
 24.99971 ± 0.00001   |  24.85 ± 0.08
 24.99991 ± 0.00001   |  25.10 ± 0.08
100.00070 ± 0.00003   | 100.00 ± 0.2
 -0.00163 ± 0.00003   |  -0.40 ± 0.2
  9.9944 ± 0.0003     |  10 ± 2
6. Reference Values for CMM Calibration - Uncertainty Propagation


Coordinates, and uncertainties for those coordinates, are the reference values for 
CMMs calibration. The coordinates and the respective uncertainties are 
calculated in real time, by a method based on laser interferometer measurement 
of the links length variation. 

For that purpose, we tried several data fusion solutions such as the Kalman
filter, the extended Kalman filter, the covariance intersection, etc. In this
paper, we present the results obtained using the standard definition for
uncertainty propagation (because of their length and some unresolved minor
issues, further results will be presented later).

In figure 14, we represent some of the estimated uncertainty values for X, Y 
and Z coordinates of the moving sphere in some regions of the measurement 
space, for a link length noise amplitude equal to 0.01. As we can see, the final 
uncertainty in each coordinate, not only depends on the link length uncertainty 
value, but also depends on the value of that coordinate and on the values of the 
other two coordinates. 

From figure 14, we can also see that the uncertainties of X and Y slowly
increase when the Z coordinate of the mobile sphere increases. The
uncertainty of the Z coordinate obtained by trilateration is large for small
values of Z and decreases as Z increases. Near Z = 0 those values are not
acceptable, so trilateration may only be used for values of Z not equal to 0.
When the mobile sphere is over the plane of the fixed spheres, only the X and Y
coordinates can be calibrated, as in the case of using the ball plate artefact.
Figure 14. Moving sphere coordinates uncertainty in several regions of the measurement space.


7. Conclusions


The measurements made with the described artefact have errors and a
previously unknown uncertainty. In this text we presented a technique to
incorporate the uncertainties of the measured values in order to obtain the final
uncertainties of the calibration results (using data fusion as a learning system,
meaning that the optimal uncertainty estimation is constantly adapted during
self-calibration of the device). However, this is not the final answer to the
problem of artefact validation: we are developing and comparing different
techniques in non-linear and non-observable state estimation.

The results obtained for the final calibration uncertainty using those 
techniques lead us to the following conclusions. First, in order to decide among 
the various techniques, we need to take into account the amount of computer 
memory required for identification, the accuracy of the method, the difficulties 
in implementation and also the ability to represent multimodal probability 
density functions (PDFs). 

In table 4, where KF represents the Kalman filter, EKF represents the 
extended Kalman filter, CI represents the covariance intersection, CP represents 
the covariance propagation, Markov represents the interval functions with fixed 
cells, probability grids or maximum entropy estimation and, finally, PF 
represents various particle filters (the set constituted by Monte Carlo 
localization, bootstrap, condensation algorithm and fittest algorithm), we 
summarize the comparison of the different techniques studied. 


Table 4. Comparing calculating data fusion techniques for uncertain

[Table body not recoverable from the source. Surviving fragments: good estimation
from scratch; starting geometry knowledge; ... consuming; ability to represent
multimodal PDFs.]


As a conclusion, we can state that the particle filters seem to be the best
methods for our goal. However, the ability to represent multimodal PDFs seems
to be very restrictive, and a more thorough analysis needs to be carried out,
taking into account the costs involved (response time, memory consumption, etc.).
On the other hand, the Kalman filter and the extended Kalman filter need starting
knowledge of the geometry, which can be estimated by measuring the positions of
the fixed points and changing coordinates to the artefact coordinate system.

The results given by the developed Matlab routines based on the
above lead us to conclude that the self-calibration of the artefact works well. The
modelling of the artefact allows us to develop the real artefact that we are
starting to build and, in the near future, we will have results from that artefact.
UNCERTAINTY IN SEMI-QUALITATIVE TESTING


The problem of assigning uncertainty-like statements to qualitative test results
still remains widely unsolved, although the volume of not purely quantitative
testing and analysis, and their economic impact, are immense. A pragmatic
approach is developed for the assessment of measurement uncertainty for
procedures where a complex characteristic (e.g. of a material) changes
discontinuously in dependence on one or more recordable, continuous quantitative
parameters, and the change of the characteristic is assessed by judgement (which
may be instrument-assisted). The principles of the approach are discussed, and
an application example is given.


1. Introduction 


While calculation, assessment, and statement of measurement uncertainty in
quantitative testing is required by international standards [1], supervised by
accreditation bodies [2], and guided by basic documents [e.g. 3], such that the
implementation of the measurement uncertainty concept has been set in motion
in calibration and testing laboratories throughout the world, the
problem of assigning uncertainty-like statements to qualitative test results
remains widely unsolved. ILAC states that for qualitative testing, consideration is
still being given as to how uncertainty of measurement applies in such cases,
and neither the GUM [3] nor the EURACHEM/CITAC Guide [4] provides any
guidance on how to tackle the problem.

On the other hand, the volume of not purely quantitative testing and analysis
is immense, and so is the economic impact of decisions taken on the basis of
qualitative results. Therefore, it seems reasonable to develop feasible
approaches. Fortunately, nowadays a very large group of qualitative testing
procedures are not as qualitative as they may seem at first glance. Rare are the
situations where testing procedures are purely descriptive, or solely depend on
human senses. Most frequently, one will have a situation where a complex
characteristic (e.g. of a material) changes discontinuously in dependence on one or
more recordable, continuous quantitative parameters. Examples are material
breakdown or detonation of explosive mixtures.

Whether or not the characteristic has changed at a certain set of parameter
values is assessed by judgement (e.g. pattern recognition, colour change) which
may be instrument-assisted. Such procedures are semi-qualitative in the sense
that the judgement is qualitative, but the result of the test is nevertheless a fully
quantified (set of) parameter(s). A certain fuzziness of judgement induces an
uncertainty in quantification.


2. Principle of the approach 


Quite generally, basic approaches like method repeatability and method
reproducibility also remain valid for semi-qualitative test procedures. Any
testing laboratory may create an estimate of its internal method scatter simply by
assessing the results of an appropriate number of repetitions. Such experiments
can be carried out under either repeatability conditions (controlled, constant
environmental and other influential parameters, same operator) or reproducibility
conditions (maximum variation of all influential parameters within the limits of
method specification, and different operators).

Laboratory intercomparisons may be designed and executed in the very
same way. Average lab precision and average between-lab bias may be
calculated from the intercomparison results using established algorithms.

Beyond the limits of these more classical approaches, a "component-by-component"
approach is also feasible, which will be even closer to the spirit of
the GUM. The basic element of this approach is also repetition. The
characteristic to be tested is modelled as a Bernoulli random variable depending
on a set of fully quantified parameters

$$\xi = f(k_1, k_2, \ldots, k_n). \qquad (1)$$

The variable may take only two values: 0 (fail) or 1 (success). During the
test, the status of $\xi$ in dependence on k is assessed by judgement, and $k_e$ is
determined as the value at which the status of $\xi$ changes. Normally, there will
be a region of certainty where the outcome of the experiment is undoubtedly 0 or 1,
and a region in the vicinity of the threshold $k_e$ where the outcome is uncertain,
mainly due to the fuzziness of judgement.

Within this region, replicate measurements at each set of parameter values
are taken, and the parameter values are increased (or decreased) in a stepwise
manner according to the actual test requirements. With n replicates, the sum of
the n outcomes at each sampling point should follow a B(n, p) binomial
distribution with known mean and variance. The (changing) probability parameter p
is estimated as the frequency of occurrence (or non-occurrence) of the event as
observed, and the corresponding variances var(p) are calculated accordingly.
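
As a small worked illustration of this estimate, the sketch below computes the frequency of occurrence and its variance at one sampling point. The source does not spell out the variance formula; the expression p(1 - p)/n used here is the usual variance of an estimated binomial proportion and is an assumption of this example, as are the function name and the numbers.

```python
def occurrence_estimate(events, n):
    """Estimate p as the observed frequency events/n and return it together
    with var(p) = p * (1 - p) / n (assumed standard binomial form)."""
    if n <= 0 or not 0 <= events <= n:
        raise ValueError("need 0 <= events <= n with n > 0")
    p = events / n
    var_p = p * (1.0 - p) / n
    return p, var_p


# Hypothetical sampling point: 3 occurrences in 10 replicates.
p, var_p = occurrence_estimate(3, 10)
```

Note that this variance estimate is zero whenever all replicates agree (p = 0 or p = 1), which needs special handling if inverse variances are later used as regression weights.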
A unit step function (with the step at $k_e$) is fitted to the
frequency-of-occurrence data. The variable parameter for the fit is $k_e$.
Different minimisation criteria are feasible, namely

i) the normal SSD = min criterion known from OLS regressions. Due to the
discrete spacing of the sampling and the infinite first derivative of the
Heaviside function at $k_e$, this criterion provides multiple solutions in an
interval between two sampling points.

ii) the criterion of equilibrated (i.e. equal) cumulated probabilities
(quantiles) on both sides of the step. This criterion provides a unique solution
for $k_e$, which will fall into the interval according to i) and, thus, also
satisfies the minimum residual SSD requirement of an OLS fit.

The criterion according to ii) should be preferred. Since on both sides of the
step estimates of an average occurrence probability are only available at the
discrete sampling points, quantiles may only be calculated with an additional
assumption on an appropriate envelope function. Detailed information on the
envelope is usually unavailable, but pragmatic approaches can be used, including
simple frequency polygons which connect the measured data points segment by
segment with straight lines. A disadvantage of this pragmatism is that
the areas beneath and above the frequency polygon will normally be different
(i.e. the strict p = 1 - q rule is violated), but since one will use in practice only
one type of the areas (the cumulated p) on both sides of the step, the method
will provide a sensible estimate for the equality point.

It may also seem reasonable to fit the quantile of the normal distribution to
the data obtained for p. On the one hand, one aims at finding estimates for
the variability (scatter) of one (or a couple of) value(s) $k_e$, which is/are
influenced by a considerable number of influential factors such that an overall
distribution close to the normal distribution can be assumed. On the other hand,
this coincides with the philosophy of the GUM [3], which also recommends the
transformation of different, most often only assumed, distributions to Gauss
distributions, and the corresponding measures of scatter to standard
uncertainties.

For the given set of data points $[p_i, k_i]$ one has to solve the minimisation
problem

$$\sum_i \left[p_i - \mathrm{quant}(k_i, k_e, s)\right]^2 = \min \qquad (2)$$

(with quant being the quantile of the normal distribution) with respect to the
parameters $k_e$ and s. This automatically generates the standard uncertainty;
the confidence interval is calculated by multiplying it by the corresponding
(one-sided) t factor. The above described approach may be further refined into a
weighted (in y) regression using the inverse variances of the $p_i$ as weights.
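
The unweighted minimisation can be sketched in Python as follows. The sketch interprets "quant" as the normal cumulative distribution function with location $k_e$ and scale s, which is an assumption, and it replaces whatever optimiser the authors used with a simple nested grid search so that only the standard library is needed. The data points are synthetic (generated from chosen parameters), not the measured values of the paper; all function names are hypothetical.

```python
import math


def normal_cdf(k, k_e, s):
    """Normal distribution function with location k_e and scale s."""
    return 0.5 * (1.0 + math.erf((k - k_e) / (s * math.sqrt(2.0))))


def ssd(points, k_e, s):
    """Sum of squared deviations between observed p_i and the model."""
    return sum((p - normal_cdf(k, k_e, s)) ** 2 for k, p in points)


def fit_normal_cdf(points, k_range, s_range, grid=40, rounds=6):
    """Nested grid search: scan a grid, then shrink the search window to
    two grid cells around the best point and repeat."""
    (k_lo, k_hi), (s_lo, s_hi) = k_range, s_range
    best = (math.inf, k_lo, s_lo)
    for _ in range(rounds):
        dk = (k_hi - k_lo) / grid
        ds = (s_hi - s_lo) / grid
        for i in range(grid + 1):
            for j in range(grid + 1):
                k_e, s = k_lo + i * dk, s_lo + j * ds
                value = ssd(points, k_e, s)
                if value < best[0]:
                    best = (value, k_e, s)
        _, k_best, s_best = best
        k_lo, k_hi = k_best - 2 * dk, k_best + 2 * dk
        s_lo, s_hi = max(s_best - 2 * ds, 1e-6), s_best + 2 * ds
    return best[1], best[2]


# Synthetic, noise-free frequencies generated with k_e = 13.8 and s = 0.28.
sampling_points = [13.0, 13.25, 13.5, 13.75, 14.0, 14.25, 14.5]
points = [(k, normal_cdf(k, 13.8, 0.28)) for k in sampling_points]

k_fit, s_fit = fit_normal_cdf(points, k_range=(13.0, 14.5),
                              s_range=(0.05, 1.0))
```

A weighted variant would multiply each squared deviation by the inverse variance of the corresponding $p_i$; a gradient-based least-squares routine would normally be preferred over the grid search for real data.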


With or without such weighting in y, one is still not completely done, since
in a last step the quantifiable uncertainties of the $k_i$ should be taken into
account by uncertainty propagation [5]. Therefore each $k_i$ is separately altered
by $\pm\tfrac{1}{2}u(k_i)$, and the minimisation problem is solved for $k_e$ with
a fixed s. The contribution of the uncertainty in $k_i$ to the total uncertainty
is obtained as

$$u(k_e; k_i) = k_e\!\left(k_i + \tfrac{1}{2}u(k_i)\right) - k_e\!\left(k_i - \tfrac{1}{2}u(k_i)\right) \qquad (3)$$

and will accordingly be summed up for all sampling points. The above
described approach will now be exemplified.


3. Example: Explosion limit determination 


The determination of explosion limits according to prEN 1839 is a semi- 
qualitative testing procedure in the sense described here. The characteristic 
under investigation is the detonation of an potentially explosive gas mixture. It 
is assessed by a pattern recognition-like procedure based on the decision 
whether a flame separation or the formation of a tall aureole took place follow- 
ing ignition. The result is a pass/fail decision. Fully quantifyable influential 
variables k are 


1) the gas composition (flammable gas in air) expressed as a fraction or con- 
centration (in the following: mol%). 


ii) the temperature. It is assumed that the characteristic under investigation 
functionally depends on the temperature. However, for the aims of this study 
one may restrict all considerations to one (e.g. a reference) temperature Ty. Ob- 
viously, this temperature can also be measured and controlled only with a non- 
zero uncertainty, but provided the functional relationship is know (e.g. from 
systematic investigations) this uncertainty may easily be propagated to the final 
result. 


iii) the temperature gradient in the detonation vessel. Without any further 
information on its influence on, or a possible insignificance (e.g. within certain 
specification limits) for, the final experimental result this influential factor can- 
not be taken into account properly. 


iv) the ignition energy. There are good reasons for the assumption that the 
dependence of detonation occurrence on this factor is of a threshold type. 
Provided the ignition energy is held well above this threshold, its influence on the 
uncertainty of the experimental result may be neglected. 


Taking the above into account, the problem is reduced to a functional 
relationship between the occurrence of the binomial event and the gas 
composition x (mole fraction). 

This composition was altered for a certain gas mixture within the range 
between 13 and 14.5 mol%, and 10 replicate observations were carried out at each 
sampling point. Each observation was assessed either as failure or success, and 
the results displayed in Table 1 were obtained. 


Table 1. Observed number of ignitions at different gas compositions 
(columns: gas composition in mol% | failure (no ignition) | success (ignition)) 





As already mentioned above, the target criterion is the best fit of the quan- 
tile of the normal distribution to the data which is attained at a step position 
x_exp for the Heaviside function and a standard deviation s of 


x_exp = 13.8034, s= 0.2791 


The goodness of fit may be assessed from figure 1. 
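
The fitting step can be illustrated with a minimal sketch. It assumes an unweighted least-squares fit of the cumulative normal distribution to the observed ignition frequencies, carried out by a plain grid search; the exact target criterion and optimiser used for the values above are not restated here, and the data set below is hypothetical (it is not the content of Table 1). The names `normal_cdf` and `fit_step` are illustrative only.

```python
from math import erf, sqrt

def normal_cdf(x, mu, s):
    """Cumulative normal distribution with mean mu and standard deviation s."""
    return 0.5 * (1.0 + erf((x - mu) / (s * sqrt(2.0))))

def fit_step(xs, freqs, mu_grid, s_grid):
    """Grid-search least-squares fit of the normal CDF to observed frequencies.

    Returns the step position x_exp (mean) and the standard deviation s.
    """
    best = None
    for mu in mu_grid:
        for s in s_grid:
            sse = sum((normal_cdf(x, mu, s) - f) ** 2 for x, f in zip(xs, freqs))
            if best is None or sse < best[0]:
                best = (sse, mu, s)
    return best[1], best[2]

# HYPOTHETICAL data: gas composition in mol% and ignitions out of 10 trials
xs = [13.0, 13.3, 13.6, 13.9, 14.2, 14.5]
ignitions = [0, 1, 3, 6, 9, 10]
freqs = [n / 10 for n in ignitions]

mu_grid = [13.0 + 0.005 * i for i in range(301)]   # 13.0 ... 14.5 mol%
s_grid = [0.05 + 0.005 * i for i in range(191)]    # 0.05 ... 1.0 mol%
x_exp, s = fit_step(xs, freqs, mu_grid, s_grid)
print(f"x_exp = {x_exp:.3f} mol%, s = {s:.3f} mol%")
```

A grid search is used only to keep the sketch free of external dependencies; any standard nonlinear least-squares routine could take its place.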

An application of the "equal cumulative probabilities" criterion (using a 
frequency polygon for interpolating between the sampling points) results in an 
x_exp of 13.7834. For the standard deviation (33% quantile) one will obtain 
left- and right-hand side values which differ slightly from each other, 


s_left = 0.2833, s_right = 0.2067 


and as the geometric mean an s of 0.2480. 

The values thus obtained for x_exp are satisfactorily close to each other and 
do not exhibit any significant difference (the latter is as low as 7% of the 
corresponding uncertainty). There is also no statistically significant difference 
between the calculated standard deviations, and s_left coincides excellently with 
the s obtained from the normal distribution approach. 
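
As a numerical cross-check of the quoted mean, the short sketch below compares two ways of averaging the left- and right-hand values. It suggests, as an observation rather than a statement of the author's procedure, that the quoted s = 0.2480 matches the quadratic (root-mean-square) mean of the two values.

```python
from math import sqrt

s_left, s_right = 0.2833, 0.2067   # values quoted in the text

rms = sqrt((s_left ** 2 + s_right ** 2) / 2)   # quadratic (root-mean-square) mean
geometric = sqrt(s_left * s_right)             # geometric mean in the strict sense

print(f"quadratic mean = {rms:.4f}, strict geometric mean = {geometric:.4f}")
```

This reading is consistent with the later use of "geometric summation" for a combination of contributions in quadrature.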


Fig. 1: Fit of the quantile of the normal distribution to experimental data 
(abscissa: gas composition in mol%; ordinate: quantile). 


Undoubtedly, one might argue whether asymmetric limits for the statistical 
scatter might or might not be an intrinsic feature of such semi-qualitative testing 
procedures. Since there are no widely accepted rules for the handling of asym- 
metric statistical boundaries, for the time being the normal distribution approach 
will be the method of choice which automatically generates symmetric standard 
deviations and confidence intervals. Time will show whether the introduction of 
the Appendix to the GUM dealing with Monte Carlo simulation (a method 
which quite often generates asymmetric boundaries) will change the above de- 
scribed situation. 

Now, the position and the statistical boundaries of the step of the Heaviside 
function are determined. Figure 2 displays the function together with the scatter 
interval at the 67% confidence level. Figure 2 also displays the frequency poly- 
gon used for interpolation purposes between the sampling points, and the left- 
and right-hand side cumulated probabilities. 

The confidence interval, which can easily be calculated by multiplication 
by the t factor for the corresponding number of degrees of freedom and the 
one-sided problem, is CI(95%) = 0.5300 (alternatively: expansion with k = 2). 
It covers in a sensible way the major part, but not all, of the fuzziness region 
of the test procedure. 


Fig. 2: Heaviside step function (black bold line) at the position x_exp, 
interpolated frequency of occurrence (grey bold line), cumulative occurrence 
(black dotted line, right scale), and statistical boundaries (black arrows) at 
the 67% confidence level. Abscissa: gas composition in mol%. 


In a last step the uncertainties of the parameter x (mixture composition) 
must be taken into account. The standard prEN 1839 requires that these 
uncertainties shall not exceed an absolute value of 0.2 mol% for all gas 
compositions in the composition range above 2 mol%. This absolute value is 
assigned to all sampling points of the experiment, and for uncertainty calculation 
each value x_i in the table is now successively reduced and increased by one half 
of the uncertainty. The minimisation problem for the quantile is now solved with 
respect to only the x_exp (i.e. with fixed s) for each of the altered data sets 
thus obtained. 
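
The perturbation procedure described above can be sketched as follows. This is an illustration under stated assumptions: the sampling points and frequencies are hypothetical (not the content of Table 1), the fit is an unweighted least-squares fit of the normal CDF with s held fixed, and the final combination in quadrature with s is one reading of the "geometric summation". The helper names `normal_cdf` and `fit_position` are illustrative.

```python
from math import erf, sqrt

def normal_cdf(x, mu, s):
    """Cumulative normal distribution with mean mu and standard deviation s."""
    return 0.5 * (1.0 + erf((x - mu) / (s * sqrt(2.0))))

def fit_position(xs, freqs, s, lo=13.0, hi=14.5, step=0.0005):
    """Least-squares step position x_exp for a FIXED standard deviation s."""
    n = int(round((hi - lo) / step)) + 1
    grid = (lo + step * i for i in range(n))
    return min(grid, key=lambda mu: sum((normal_cdf(x, mu, s) - f) ** 2
                                        for x, f in zip(xs, freqs)))

# HYPOTHETICAL sampling points (mol%) and observed ignition frequencies
xs = [13.0, 13.3, 13.6, 13.9, 14.2, 14.5]
freqs = [0.0, 0.1, 0.3, 0.6, 0.9, 1.0]
s = 0.2791    # standard deviation from the normal distribution approach (text)
u_x = 0.2     # composition uncertainty in mol% required by prEN 1839 (text)

contributions = []
for i in range(len(xs)):
    lowered = xs[:i] + [xs[i] - u_x / 2] + xs[i + 1:]   # x_i - u/2
    raised = xs[:i] + [xs[i] + u_x / 2] + xs[i + 1:]    # x_i + u/2
    # Eq. (3): difference of the refitted step positions
    contributions.append(fit_position(raised, freqs, s)
                         - fit_position(lowered, freqs, s))

# assumed combination in quadrature with the fuzziness-of-assessment term s
u_c = sqrt(s ** 2 + sum(c ** 2 for c in contributions))
print([round(c, 4) for c in contributions], round(u_c, 4))
```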


For the third sampling point the corresponding line in the data table will be 
altered first to the composition reduced by 0.1 mol% (one half of the 
uncertainty), resulting in an x_exp = 13.7948, and subsequently to the 
composition increased by 0.1 mol%, yielding an x_exp of 13.8217. The 
contribution of the third sampling point to the total uncertainty can now be 
calculated according to 


u(x_exp; x_3) = 13.8217 - 13.7948 = 0.0269. 


After geometric summation of the contributions from all sampling points 
and combination with the fuzziness-of-assessment uncertainty, one will obtain a 
combined total uncertainty of u_c(x_exp) = 0.3433. As expected, the main 
contributors are sampling points 5 and 6. 
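
The quoted figures allow a rough back-calculation. The sketch below assumes that the combination is a sum in quadrature and that the fuzziness-of-assessment term is the s = 0.2791 from the normal distribution approach; neither assumption is spelled out in the text, so the result is indicative only.

```python
from math import sqrt

u_total = 0.3433   # combined total uncertainty quoted in the text
s = 0.2791         # assumed fuzziness-of-assessment standard deviation
u_x3 = 0.0269      # contribution of the third sampling point (text)

# implied aggregate contribution of all composition uncertainties
u_composition = sqrt(u_total ** 2 - s ** 2)

# share of the third sampling point in the composition-related variance
share_x3 = u_x3 ** 2 / u_composition ** 2

print(f"u_composition = {u_composition:.4f} mol%, share of x_3 = {share_x3:.1%}")
```

Under these assumptions the third sampling point carries only a small part of the composition-related variance, which fits the remark that points 5 and 6 dominate.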


4. Conclusions 


The approach presented here is a possible way to tackle the problem of 
uncertainty assessment in semi-qualitative testing. However, several details and 
questions as raised above need further investigation and clarification. 


Coherent anomalies can affect the quality of digitized images. In this paper the 
method, which was developed for their removal in images from old movies, is 
proposed in the context of a metrological application in nanotechnology, where 
accurate thickness measurements of special coatings are obtained after image 
processing. The method constructs a piece-wise spline approximation to restore 
the corrupted data in the low-pass filtered version of the digitized image obtained 
by a wavelet decomposition. An invisible reconstruction of the damaged area is 
obtained, since the type of the morphology in suitable domains is preserved. 
Simple statistical tests are able to automatically recognize the morphology type 
and to construct the appropriate approximation. The calibration study that has 
been performed to identify the test thresholds is described in the paper. It uses 
simulated images with different types of noise and of scratches. The benefit of the 
method adopted to locally preprocess the metrological image before acquiring the 
measurements is finally discussed. 


1. Introduction 


The digitized images of nanotechnology surfaces can be affected by 
coherent anomalies, both coming from the surface itself and arising from 
the digitizing process. The metrological application in nanotechnology 
is described in [7]: the preparation of the cross-section of the specimen 
causes distortion of the border between different materials; thickness 
measurements are obtained by measuring the distance between borders in the 
digitized image. A possible misalignment during the scanning process can 
cause the white vertical defects visible in figure 1. 


*Work partially funded under EU Project SofTools.MetroNet Contract 
n.G6RT-CT-2001-05061 





Figure 1. Vertical anomalies in a metrological image from nanotechnology: on the left 
side the second white line crosses the lower border (in the digitized image it occurs at 
column 33) 


Another example of vertical anomalies occurs in digitized images from 
old movies. These anomalies are caused by abrasion due to the contact of 
the film material with the mechanical parts of the projector. The vertical 
scratches are physically characterized by the partial loss of the film emulsion 
in a strip, with sidelobes of material; they are present in an image sequence 
and are seen by the audience as bright or dark lines. The anomalies appear 
as lines extending throughout almost the entire image, usually in the same 
position and with constant width. 

In this paper, we propose to apply the strategy developed to remove 
the scratches in old movie images [2], to the digitized image of the metro- 
logical application. The method is able to preserve the image information 
content in the wavelet domain while removing the artifacts produced by 
the scanning operation. Therefore it should be applied before the accurate 
measuring of the coating thickness in the digitized image. In the following 
the linear anomaly will be named scratch as in the example of the old movie 
image. 

A digitized image I is associated to a matrix $F = ((f_{ij})) \in \mathbb{R}^{(N_1, N_2)}$. 
The elements $f_{ij}$ are the sampled values of the intensity function $f(x, y)$ at 
the pixel $(i, j)$ and they represent the Grey-level values of the image in a 
Grey-level scale related to the accuracy of the acquisition instrument. The 
function f is supposed in $L^2(\mathbb{R}^2)$, that is, a signal with finite energy. A 
scratch causes incorrect values in a certain number (say 5-10) of consecutive 
columns of the matrix F. The distortion, $l(\cdot,\cdot)$, is assumed to be additive. 
Therefore the sampled intensity function can be written as 


$$f(x,y) = f^*(x,y) + l(x,y) + e(x,y), \qquad (1)$$
where $f^*$ is the intensity function of the cleaned image and e is a random 
Gaussian noise. 

A scratch removal can be formulated as a corrupted data problem in 
$L^2(\mathbb{R}^2)$, assuming that some information is still available in the damaged 
columns and that features similar to the ones present in the adjacent columns of 
F are to be reconstructed in order to recover the information content $f^* + e$. 
The 1D approximation or interpolation methods, applied to each row of 
F, or 2D approximation methods in a strip of F, often provide a “visible” 
restoration, since they only partially deal with the high frequencies: the 
brightness discontinuity can be a feature of the original signal that can 
overlap the high frequency noise in the damaged area. In [1], [2] a new 
approach based on wavelet decomposition of F was used to separate the 
low frequency content from the high frequency details in order to appropri- 
ately treat the vertical features. The authors have used different wavelet 
bases and different approximation strategies applied to the matrices from 
the wavelet decomposition of F. 

In the next section the three-step method developed in [2] is briefly de- 
scribed. To preserve the spatial correlation it adopts the strategy of choos- 
ing the approximation spaces according to the features that are present 
in a small area of the image adjacent to the scratch. The recognition of 
the features is performed by simple statistical tests applied in small areas. 
In Section 3 the calibration of the thresholds, used in the tests, and their 
robustness in presence of noisy images are discussed. In Section 4 the pre- 
processing of the metrological image by the three-step method is described 
and its benefits are shown. In the Appendix a short mathematical 
description of the bidimensional multiresolution analysis and of the filtering 
process is reported. 


2. The three-step method for scratch removal 


We assume the scratch on the image I to be in the vertical direction, starting 
from the column $k_s$, and with constant width w. The method operates 
in three steps, namely a wavelet decomposition, an approximation phase 
and the wavelet reconstruction. Some mathematical background on the 
wavelet decomposition and reconstruction of the first and third steps is 
given in Appendix A. Here only a brief summary of the approximation 
step is reported. 

The first step produces the low-pass filtered version of the original image 
I and the images containing the vertical, horizontal and diagonal details 
of I: they are indicated with the matrices A, V, H, R, respectively. Let us 
assume $A \in \mathbb{R}^{(256,256)}$. 

Since the scratch is strongly characterized in the vertical direction, 
the information relating to the linear defect is confined in A and 
V. Thus the matrices H and R of the horizontal and diagonal details are 
not processed, while in the matrices A and V only the columns corresponding 
to the scratch area will be corrected, also using the uncorrupted 
information in the adjacent columns. The correction of the columns in V 
is obtained by applying a median filter. 
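
The claim that a vertical defect is confined to A and V can be checked on a toy image. The sketch below is not the decomposition used in [2], which relies on biorthogonal filters: it swaps in a one-level orthonormal Haar transform for brevity, and the function names `haar_decompose` and `haar_reconstruct` are illustrative.

```python
import numpy as np

def haar_decompose(F):
    """One-level separable Haar decomposition of an image with even dimensions.

    Returns (A, V, H, R): low-pass image and vertical, horizontal and
    diagonal detail images.
    """
    F = np.asarray(F, dtype=float)
    r = np.sqrt(2.0)
    lo = (F[:, 0::2] + F[:, 1::2]) / r   # low-pass across the columns
    hi = (F[:, 0::2] - F[:, 1::2]) / r   # high-pass across the columns
    A = (lo[0::2] + lo[1::2]) / r        # low-pass in both directions
    H = (lo[0::2] - lo[1::2]) / r        # responds to horizontal edges
    V = (hi[0::2] + hi[1::2]) / r        # responds to vertical edges
    R = (hi[0::2] - hi[1::2]) / r        # diagonal details
    return A, V, H, R

def haar_reconstruct(A, V, H, R):
    """Inverse of haar_decompose."""
    r = np.sqrt(2.0)
    lo = np.empty((2 * A.shape[0], A.shape[1]))
    hi = np.empty_like(lo)
    lo[0::2], lo[1::2] = (A + H) / r, (A - H) / r
    hi[0::2], hi[1::2] = (V + R) / r, (V - R) / r
    F = np.empty((lo.shape[0], 2 * lo.shape[1]))
    F[:, 0::2], F[:, 1::2] = (lo + hi) / r, (lo - hi) / r
    return F

# Toy image: flat background with a one-pixel-wide vertical "scratch"
F = 100.0 * np.ones((8, 8))
F[:, 5] += 50.0

A, V, H, R = haar_decompose(F)
print("max |H| =", np.abs(H).max(), " max |R| =", np.abs(R).max())
print("max |V| =", np.abs(V).max())
```

For this idealized scratch the horizontal and diagonal detail images vanish, so only A and V would need correction before reconstruction.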

In A, the data fitting procedure is complex. The parameters of the 
scratch, namely the initial column $k_1$ and the width $w'$, are computed 
from $k_s$, $w$ and the length of the low-pass filter used in the first step. 
For example, w = 5 and the filters 'bior3.1' (MATLAB notation) imply 

r= 

The basic idea is now to replace the corrupted data in the columns of 
A with the values of a suitable bidimensional approximating function, say 
s, defined in the domain 


$$D = [k_1 - w', k_1 + 2w' - 1] \times [1, 256].$$


It has horizontal dimension equal to $3w'$ and is centered on the scratch. The 
function s must be able to preserve the spatial correlation and the features 
present in a neighborhood of the scratch. This requirement suggested using 
a piece-wise function whose pieces are constructed according to the local 
morphology of the image. They are defined [2] in the following sub-domains 
of constant vertical dimension d = 5: 


$$D_j = [k_1 - w', k_1 + 2w' - 1] \times (5(j-1), 5j], \qquad j = 1, 2, \dots$$


The sub-domains are classified into two types, homogeneous and non-homogeneous: 
the first type is characterized by small variations of the Grey-level 
values, whereas in non-homogeneous domains some edges are present. The 
criteria to recognize the absence/presence of features (edges) are discussed 
in the next section. 

The 2D data fitting procedure of the second step constructs each piece 
of s as a tensor product of splines of the same order, namely a bilinear spline 
in homogeneous sub-domains and a bicubic spline otherwise. In each of 
the two cases, the optimal knot displacement is identified by means of a 
heuristic procedure described in [2] to solve the nonlinear minimization 
problem. 


3. Calibration of the tests for feature recognition 


Let us emphasize that the exact recognition of the type of a sub-domain $D_j$ 
enables the method to use the appropriate spline space to get an invisible 
restoration, since the special feature in $D_j$ is preserved. 

The recognition method is based on simple first order statistics associated 
to a minor M of a matrix, namely the standard deviation $\sigma(M)$, the 
mean value $\mu(M)$ and the median $m(M)$. 

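Each $D_j$ is divided into three equal parts with a horizontal dimension 
equal to $w'$, $D_j = \Sigma_j^L \cup \Sigma_j \cup \Sigma_j^R$, where $\Sigma_j$ corresponds to the 
central corrupted columns, and $\Sigma_j^L$ and $\Sigma_j^R$ are the left and right 
neighborhoods, respectively. 
The following minors of the matrix $A = ((a_{ij}))$ are defined 


$$L_j = ((a_{il})), \quad (i,l) \in \Sigma_j^L \qquad (2)$$
$$C_j = ((a_{il})), \quad (i,l) \in \Sigma_j \qquad (3)$$
$$R_j = ((a_{il})), \quad (i,l) \in \Sigma_j^R. \qquad (4)$$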


We have observed that the presence of an edge in the central block 
$\Sigma_j$ corresponds to a significant increase of $\sigma(C_j)$ with respect to $\sigma(C_{j-1})$, 
while in homogeneous domains the mean and the median value do not differ 
significantly in the left and right parts, $\Sigma_j^L$ and $\Sigma_j^R$, containing uncorrupted 
data. The following quantities are defined: 


$$\delta_j^L = |\mu(L_j) - m(L_j)|, \qquad \delta_j^R = |\mu(R_j) - m(R_j)|.$$


The plots in figure 2 show the $\delta$ differences related to the three minors in 
Eqs. (2)-(4) for increasing value of j (they pertain to the image in figure 3). 
The behavior of the plots shows that suitable thresholds can be identified 
to classify the two types of domains. We have chosen the threshold values 
$w_1$, $w_2$, $w_3$ independently of the image, in order to apply the method to 
several different images. 


Figure 2. Plots of the difference between the mean and the median values at increasing 
j-index in $\Sigma_j^L$ (top), $\Sigma_j$ (middle), $\Sigma_j^R$ (bottom) 





Figure 3. KNIGHT image in Kokaram CD: domain segmentation of the scratch vertical 
region. In the column on the right side the results of the recognition method are shown 
(1 = homogeneous, 2 = not-homogeneous domain) 


Hence we have defined a domain $D_j$ to be non-homogeneous if at least 
one of the following tests is verified: 


$$\sigma(C_j) - \sigma(C_{j-1}) > w_1 \qquad (5)$$
$$\delta_j^L > w_2 \ \text{AND} \ \delta_j^R > w_2 \qquad (6)$$
$$|\delta_j^L - \delta_j^R| > w_3. \qquad (7)$$

The thresholds have been calibrated on a large class of test images, with 
256 Grey-levels representation, from old movies (Kokaram [6] CD) and also 
from the MATLAB image data bank. 
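
A minimal sketch of the recognition tests follows. It reads tests (5)-(7) as: the standard deviation of the central block exceeds that of the previous central block by more than w1; the mean-median differences of both side blocks exceed w2; the two differences differ by more than w3. The default thresholds are the calibrated values reported in this section; 0-based array indexing, the population standard deviation and the treatment of the first sub-domain (no predecessor, so test (5) is inactive) are assumptions of the sketch, and the function names are illustrative.

```python
import numpy as np

def is_non_homogeneous(L, C, R, C_prev, w1=36.0, w2=5.0, w3=7.0):
    """Return True if at least one of the tests (5)-(7) is satisfied."""
    delta_L = abs(np.mean(L) - np.median(L))
    delta_R = abs(np.mean(R) - np.median(R))
    test5 = np.std(C) - np.std(C_prev) > w1
    test6 = delta_L > w2 and delta_R > w2
    test7 = abs(delta_L - delta_R) > w3
    return bool(test5 or test6 or test7)

def classify_subdomains(A, k1, w, d=5):
    """Label each sub-domain: 1 = homogeneous, 2 = non-homogeneous.

    A  : low-pass image; k1 : first scratch column (0-based); w : scratch width.
    """
    labels = []
    C_prev = None
    for top in range(0, A.shape[0] - d + 1, d):
        rows = slice(top, top + d)
        L = A[rows, k1 - w:k1]            # left neighborhood
        C = A[rows, k1:k1 + w]            # central (corrupted) block
        R = A[rows, k1 + w:k1 + 2 * w]    # right neighborhood
        if C_prev is None:
            C_prev = C                    # first block: test (5) cannot fire
        labels.append(2 if is_non_homogeneous(L, C, R, C_prev) else 1)
        C_prev = C
    return labels

# Synthetic 20 x 30 image: flat at 100, with a horizontal edge at row 12
A = 100.0 * np.ones((20, 30))
A[12:, :] = 200.0
labels = classify_subdomains(A, k1=10, w=5)
print(labels)
```

Only the sub-domain that contains the edge (rows 10 to 14) is flagged; the flat blocks above and below it are classified as homogeneous.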

To every image used in the calibration study, we have added simulated 
scratches in order to be able to compare the original unscratched image 
and the restored one. The additive scratch model in Eq. (1) is used with 
the function l deduced from the cross-section function in [9]. For fixed $k_s$, $w$, 
first column and width of the scratch, in any pixel $(i, j)$, $i = k_s, \dots, k_s + w - 1$, 
$j = 1, \dots, N_2$ of the original I the systematic effect is given by: 
3n(t — zte — rd(j —] 
l(i, 7) = 0.5/*-#el À cos ame ee re), Pem Ret — (8) 


where A controls the intensity of the simulated scratch, and rd(-) is a ran- 
dom generator. 

To assess the ability of the feature recognition tests, two criteria were 
used: 

- the SNR (Signal-to-Noise Ratio) between the unscratched image and 
the restored one (with values $\tilde f_{ij}$): 


2il á 
RISS PS j 
- the good quality at a visual inspection (in the following named "eye 
norm"). 

In image analysis it is often necessary to add the "quality" criterion 
to the quantitative one, since the maximization of the SNR alone does not 
always assure an invisible restoration. Indeed, the SNR is an overall 
criterion that may not reveal a defect of systematic type, as the human eye 
can. 

The calibration study provided the following thresholds: $w_1 = 36$, $w_2 = 5$, 
$w_3 = 7$. When the Grey levels of the image belong to a different scale, 
the corresponding thresholds can be obtained by a scaling operation. 
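
The scaling operation is not specified further. A simple possibility, assumed here purely for illustration, is a linear rescaling by the ratio of the numbers of Grey levels; the function name `scale_thresholds` is hypothetical.

```python
def scale_thresholds(thresholds, grey_levels, reference_levels=256):
    """Linearly rescale thresholds calibrated on a reference Grey-level scale."""
    factor = grey_levels / reference_levels
    return tuple(t * factor for t in thresholds)

# thresholds calibrated on 256 Grey levels, rescaled to a 4096-level image
w1, w2, w3 = scale_thresholds((36, 5, 7), grey_levels=4096)
print(w1, w2, w3)
```

The 4096-level case corresponds to the Grey-level scale of the metrological image considered in Section 4.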

In figure 3 the vertical lines define the domain D and its subdivision 
in $\Sigma_j^L$, $\Sigma_j$ and $\Sigma_j^R$; the horizontal lines separate the subdomains $D_j$. The 
results of the tests, reported in the column at right in the figure, show that 
the type of the $D_j$ containing edges has been well classified. 

Let us now discuss the robustness of the identified values of the thresholds 
of the tests in presence of highly noisy data via a simulation study. We 
have analysed the test images having simulated scratches using Eq. (8) for 
different values of the parameters. For three images in the MATLAB database 
and with w = 5, the SNR values are reported in the second column of 
Table 1. We have generated several noisy images using the MATLAB routine 
"imnoise": white additive Gaussian noise with variance in [0.1e-3, 0.1e-1] 
and Speckle multiplicative noise with variance in [0.2e-1, 0.4e-1] are 
considered. Table 1 reports the SNR values for three test images and four 
types of noise. We underline that in every case the eye norm revealed a 
good restoration in spite of the bad values of the SNR function. An example 
of a detail in the Lena image is given in figure 4 where Gaussian noise with 
$\sigma^2 = 0.005$ was added. It must be noted that for values of the variance 
greater than the ones in Table 1, a higher blurring effect was visible, 
therefore in the damaged region the domains with systematic features are 
always recognized as homogeneous, thus obtaining a visible restoration due 
to the presence of noise. 


Figure 4. Lena image with a simulated scratch in the eye zone and Gaussian noise with 
$\sigma^2 = 0.005$ (left), the restored image by the three-step method (right) 


4. The metrological case-study in nanotechnology 


In [7] an accurate measuring procedure for the thickness of special thin 
coatings of different magnetic materials was discussed. The preparation 
of the cross-section of the material and the misalignment of the scanning 
process by a magnetic sensor have produced noisy data in the border 
between the different material layers (for example, the most visible vertical 
scratch is on the left side of figure 1). The task of the metrologist was to 
determine the precise contour exceeding fixed magnetic values using the 
digitized image (Grey-level visualization). The applied mathematical 
procedure was a three-level wavelet decomposition with Haar basis to separate 
the horizontal features, and, at each level, a compression was also performed 
by neglecting the vertical and diagonal details: they do not contribute to 
the knowledge of the border position that has a horizontal direction. The 
metrologist measured the distance between two horizontal boundaries (at 
the same Grey-level), by means of a visual inspection of the processed 
digitized image. This mathematical procedure emphasizes the horizontal 
contours to get the thickness measurements in the Grey-level representation. 
However, neglecting every vertical detail might cause misleading 
information in predicting a reasonable position of the border when it crosses the 
vertical anomaly. 


Table 1. Images with a simulated scratch (w = 5) and added noise 
(test images: LENA, TRUCK, BABOON) 

We now consider the anomaly that occurs in the left side of the image at 
$k_s = 33$ that has constant width w = 5. To get the measurements using the 
digitized image (4096 Grey-level scale), we need to accurately compute the 
horizontal contours: we apply the three-step method, before the multi-level 
wavelet analysis with compression, developed in the PTB Laboratory. The 
benefit due to our approximation step is to preserve the special feature of 
the lower border that also extends in the vertical direction near column 
33. Indeed, some vertical details of the border position, can be recovered, 
which otherwise would be completely lost in the compression operation. 

Table 2 shows the comparison of the results obtained in processing the 
image only by the method [7] and the ones obtained by adding the pre- 
processing with our method. The acquired measures of the borders, Y, 
and Y; are comparable. The main result is that with our preprocessing the 
thickness $\Delta Y$ at level 0 (see last row in Table 2) can be measured since 
the upper border has been recovered in a more accurate way. Moreover, 
a similar benefit was obtained when comparing the results after only one 
level of the Haar decomposition and compression. Therefore we can suggest 
the use of our preprocessing and only one level of the Haar decomposition 
with compression, thus saving computing time. 


Table 2. Thickness ($\Delta Y$) in the scratch area ($k_s = 33$) at 5 magnetic levels: 
columns 2-4 report the values obtained without preprocessing; columns 5-7 the 
values obtained with our preprocessing. 





5. Conclusions 


This paper has addressed the problem of linear defect removal in images. 
The special treatment of these systematic anomalies affecting the image 
quality is useful when the digitized image must be used to obtain accurate 
measurements, such as in the discussed application in nanotechnology. Here 
the three-step method, developed for scratch removal in old movie images, 
was applied as an image preprocessing before the measuring procedure. 

The novelty of the method is a biorthogonal wavelet decomposition cou- 
pled with the construction of appropriate spline approximations to restore 
the damaged low-pass filtered area. The good performance in reconstruct- 
ing the local features that can be present in the image is due to the ability 
of three statistical tests to recognize the specific features in subdomains 
near the damaged area. 

The simulation study, performed to calibrate the thresholds in the 
recognition tests, has shown the removal method in wavelet domains to be 
robust also in presence of noisy images. Therefore in the metrological image, 
the local features characterizing the borders crossing a vertical defect were 
recovered, providing accurate measures. Moreover, an accurate determination 
of the thickness for a larger number of Grey-levels has been obtained, 
saving computing time. 


Appendix A. Bidimensional multiresolution analysis and 
the filtering process 


The first/third steps of the method in [2] for scratch removal consist of 
a wavelet decomposition/reconstruction in $L^2(\mathbb{R}^2)$, where a multiresolution 
analysis [5] (MRA) has been defined. A multiresolution in $L^2(\mathbb{R}^2)$ can 
be easily obtained as tensor product of one dimensional multiresolution 
analysis. More precisely let 


$$\varphi(x) = \sum_{k \in \mathbb{Z}} h_k \varphi(2x - k), \qquad h_k \in \mathbb{R}, \qquad (A.1)$$


be a refinable function in $L^2(\mathbb{R})$, that is the solution of the refinement 
equation. The mask $h = \{h_k\}$ is assumed real and finitely supported, that 
is only a finite number of $h_k$ are different from zero: consequently the 
symbol associated to $\varphi$ 


$$h(z) = \sum_{k \in \mathbb{Z}} h_k z^k$$


is a Laurent polynomial and the function $\varphi$ is compactly supported. 

A basic assumption on $\varphi$ to get a MRA is that the integer translates, 
$\varphi(\cdot - k)$, form a Riesz basis: this hypothesis allows one to define a wavelet 
function 


$$\psi(x) = \sum_{k \in \mathbb{Z}} g_k \varphi(2x - k), \qquad g_k \in \mathbb{R}, \qquad (A.2)$$


with a finitely supported mask $g = \{g_k\}$ and the subspaces of $L^2(\mathbb{R})$ giving 
the MRA 


$$V_j = \mathrm{span}\{2^{j/2}\varphi(2^j x - k) =: \varphi_{jk}(x)\},$$
$$W_j = \mathrm{span}\{2^{j/2}\psi(2^j x - k) =: \psi_{jk}(x)\}.$$


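Then a bidimensional multiresolution analysis (2DMRA) in $L^2(\mathbb{R}^2)$ is 
obtained considering the spaces [5] 


$$\mathbf{V}_j = V_j \otimes V_j, \qquad \mathbf{W}_j^h = V_j \otimes W_j,$$
$$\mathbf{W}_j^v = W_j \otimes V_j, \qquad \mathbf{W}_j^d = W_j \otimes W_j.$$


A function $f \in \mathbf{V}_{j+1}$ can be decomposed by means of the projections into the 
scaling space $\mathbf{V}_j$ and the remainder spaces $\mathbf{W}_j$ 


$$f(x,y) = f_j(x,y) + r_j^h(x,y) + r_j^v(x,y) + r_j^d(x,y) := f_j(x,y) + R_j(x,y). \qquad (A.3)$$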


By the inclusion property of the $\mathbf{V}_j$ the decomposition can be iterated at 
consecutive levels $j-1, \dots, j-L$ 


$$f(x,y) = f_{j-L}(x,y) + R_{j-L}(x,y) + R_{j-L+1}(x,y) + \dots + R_j(x,y) \qquad (A.4)$$


where any $R_k = r_k^h + r_k^v + r_k^d$ is given by three components. 
Vice versa, given the decomposition Eq. (A.4) it is possible to reconstruct 
the original function f with an inverse procedure. 

When the bases are not orthogonal a new pair $(\tilde\varphi, \tilde\psi)$ of a refinable and a 
wavelet function, which generates an MRA $\{\tilde V_j, \tilde W_j\}$, is considered, in order 
to compute the generalized Fourier expansion of a given function [5]. The 
pairs $(\varphi, \psi)$ and $(\tilde\varphi, \tilde\psi)$ are dual or biorthogonal, in the sense that 


$$\langle \varphi_{j,n}, \tilde\varphi_{j',n'} \rangle = \delta_{j,j'}\,\delta_{n,n'}, \qquad \langle \varphi_{j,n}, \tilde\psi_{j',n'} \rangle = 0,$$
$$\langle \psi_{j,n}, \tilde\psi_{j',n'} \rangle = \delta_{j,j'}\,\delta_{n,n'}, \qquad \langle \psi_{j,n}, \tilde\varphi_{j',n'} \rangle = 0.$$


Let $\tilde h = \{\tilde h_k\}$, $\tilde g = \{\tilde g_k\}$ be the real and finitely supported masks of $\tilde\varphi$ 
and $\tilde\psi$. The above conditions can be expressed in terms of the symbols 
associated to the refinable and wavelet functions [3], that is 


$$h(z)\tilde h(z^{-1}) + h(-z)\tilde h(-z^{-1}) = 1, \qquad z = e^{i\omega} \qquad (A.5)$$


(i imaginary unit) with $h, \tilde h$ normalized as $h(1) = \tilde h(1) = 1$ and 
$g(z) = z^{\alpha}\tilde h(-z)$, $\tilde g(z) = z^{\alpha} h(-z)$, with $\alpha$ odd. Both the wavelet decomposition 
and the reconstruction can be interpreted in terms of signal filtering. In 
fact it is known that the sequences $\{h_k\}$ and $\{\tilde h_k\}$ can be assumed as digital 
low-pass FIR filters and $\{g_k\}$ and $\{\tilde g_k\}$ as digital high-pass FIR filters. In 
this context $h(z), \tilde h(z), g(z), \tilde g(z)$ are the transfer functions of the filters and 
Eq. (A.5) is the perfect reconstruction condition. Moreover the projection 
operations on the spaces of the MRA reproduce the classical sub-band 
coding [8]. 
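
As a quick sanity check of the perfect reconstruction condition, one can take the Haar filter, which is its own dual. The short derivation below assumes the normalization $h(1) = \tilde h(1) = 1$ stated above, so that the Haar symbol is $h(z) = (1+z)/2$; it is a worked example, not part of the original derivation.

```latex
h(z) = \tilde h(z) = \tfrac{1}{2}(1 + z), \\
h(z)\,\tilde h(z^{-1}) = \tfrac{1}{4}(1 + z)(1 + z^{-1}) = \tfrac{1}{4}\,(2 + z + z^{-1}), \\
h(-z)\,\tilde h(-z^{-1}) = \tfrac{1}{4}(1 - z)(1 - z^{-1}) = \tfrac{1}{4}\,(2 - z - z^{-1}), \\
h(z)\,\tilde h(z^{-1}) + h(-z)\,\tilde h(-z^{-1}) = 1 .
```

The cross terms in $z$ and $z^{-1}$ cancel, so the condition holds for every $z$ on the unit circle.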
Then the filtered versions of a given image obtained by a sub-band coding 
are respectively the projections into the refinable space and into the W 
type spaces (see Eq. (A.3)). They can be viewed as functions in $L^2(\mathbb{R}^2)$. 
The digitized values of the four images are obtained by convolution-decimation 
operators having a tensor product structure 


$$P_a f(n, n') = \sum_{j,k \in \mathbb{Z}} f(j,k)\, h_{2n-j+1}\, h_{2n'-k+1}$$


$$P_h f(n, n') = \sum_{j,k \in \mathbb{Z}} f(j,k)\, h_{2n-j+1}\, g_{2n'-k+1}$$


$$P_v f(n, n') = \sum_{j,k \in \mathbb{Z}} f(j,k)\, g_{2n-j+1}\, h_{2n'-k+1}$$


$$P_d f(n, n') = \sum_{j,k \in \mathbb{Z}} f(j,k)\, g_{2n-j+1}\, g_{2n'-k+1}.$$


The dual filters $\tilde h = \{\tilde h_k\}$, $\tilde g = \{\tilde g_k\}$ allow one to introduce the analogous 
adjoint operators $P_a^*, P_h^*, P_v^*, P_d^*$ involved in the reconstruction operation. 
Indeed, it is the tensor product structure that allows one to separate the 
high frequency details according to the horizontal, vertical and diagonal 
directions. 

In conclusion, a digital image represented by a matrix F can be 
decomposed into four matrices by the operators above: the matrix from the operator 
$P_a$, say A, is the low-pass filtered version of F; the matrices from the 
operators $P_v$ and $P_h$, V and H, contain the vertical and horizontal details 
of the image; the matrix from the operator $P_d$, R, contains the diagonal 
details. 

In the three-step method [2] the biorthogonal B-spline bases were suggested 
since they show a better visual behavior in the restoration than the Haar 
basis, generally adopted. To keep the computational cost low, bases with 
minimum support are to be preferred, for example the "bior2.2" or "bior3.1" 
filters (MATLAB notation). 


Least squares methods provide a flexible and efficient approach for analyzing 
metrology data. Given a model, measurement data values and their associated 
uncertainty matrix, it is possible to define a least squares analysis method that 
gives the measurement data values the appropriate ‘degree of belief’ as specified 
by the uncertainty matrix. Least squares methods also provide, through $\chi^2$ values 
and related concepts, a measure of the conformity of the model, data and input 
uncertainty matrix with each other. If there is conformity, we have confidence 
in the parameter estimates and their associated uncertainty matrix. If there is 
nonconformity, we seek methods of modifying the input information so that 
conformity can be achieved. For example, a linear response may be replaced by a 
quadratic response, data that has been incorrectly recorded can be replaced by 
improved values, or the input uncertainty matrix can be adjusted. In this paper, 
we look at a number of approaches to achieving conformity in which the main 
element to be adjusted is the input uncertainty matrix. These approaches include 
the well-known Birge procedure. In particular, we consider the natural extensions 
of least squares methods to maximum likelihood methods and show how these 
more general approaches can provide a flexible route to achieving conformity. This 
work was undertaken as part of the Software Support for Metrology and Quantum 
Metrology programmes, funded by the United Kingdom Department of Trade and 
Industry. 


1. Introduction 


Least squares analysis methods are widely used in metrology and are justi- 
fied for a large number of applications on the basis of both statistical theory 
and practical experience. Given a model specified by parameters a, mea- 
surement data values and their associated uncertainty matrix, it is possible 
to define a least squares analysis (LSA) method that gives the measurement 
data values the appropriate ‘degree of belief’ as specified by the uncertainty 
matrix. LSA also provides a concise method of evaluating the uncertainty 
matrix associated with estimates of the fitted parameters. 

Through $\chi^2$ values and related concepts, it is possible to give a measure 
of the conformity (or consistency) of the model, data and input uncer- 
tainty matrix with each other. If there is conformity, we have confidence in 
the parameter estimates and their associated uncertainty matrix. If there is 
nonconformity, we seek methods of modifying the input information so that 
conformity can be achieved. Nonconformity indicates that the ensemble of 
input information consisting of both model assumptions and observations 
represents an unlikely state and that parameter estimates and their associ- 
ated uncertainties based on this input information may therefore be invalid. 
By reassessing the input information in the light of the nonconformity, we 
may be able to develop an alternative, more self-consistent interpretation 
of the input information or to quantify the changes in the information that 
would be required to bring about conformity. 

In section 2, we summarize the main elements of LSA along with solu- 
tion algorithms. In section 3 we look at a number of approaches to achieving 
conformity in which the main element to be adjusted is the input uncer- 
tainty matrix. These approaches include the well-known Birge procedure. 
In particular, we consider the natural extensions of least squares methods to 
maximum likelihood methods and show how these more general approaches 
can provide a flexible route to achieving conformity. We also indicate in 
this section how these procedures can be implemented in a numerically 
stable way. We illustrate these procedures on a number of applications in 
section 4. Our concluding remarks are given in section 5. 


2. Least squares analysis 


Suppose we have a linear model in which the responses $\eta = (\eta_1,\ldots,\eta_m)^T$
of a system are determined by parameters $\alpha = (\alpha_1,\ldots,\alpha_n)^T$ through a
fixed, known $m \times n$ matrix $C$ so that $\eta = C\alpha$. We assume that $m > n$ and
$C$ is of rank $n$. We wish to determine estimates of the parameters $\alpha$ from
measurements of responses $\eta_i$. Suppose the measurement model is


$$Y = \eta + E \tag{1}$$


with $E(E) = 0$, $\mathrm{Var}(E) = V$, and that we observe measurements $y$ of $Y$.
If the uncertainty matrix $V$ has full rank, the least squares estimate $a$ of $\alpha$,
given $y$, minimizes $\chi^2 = (y - C\alpha)^T V^{-1} (y - C\alpha)$. For the case $V = I$, the
identity matrix, $a = C^\dagger y$, where $C^\dagger = (C^T C)^{-1} C^T$ is the pseudo-inverse
of $C$. While the estimate $a$ can be calculated thus in terms of $C$, a better
approach is to factorize $C$ as $C = QR$, where $Q$ is an $m \times m$ orthogonal
matrix with $Q^T Q = Q Q^T = I$ and $R$ is an $m \times n$ upper-triangular matrix.
Writing


$$C = QR = [Q_1 \;\; Q_2] \begin{bmatrix} R_1 \\ 0 \end{bmatrix} = Q_1 R_1,$$


we see that $C^\dagger = R_1^{-1} Q_1^T$ and $a$ hence solves the $n \times n$ triangular system
$R_1 a = Q_1^T y$. Here $Q_1$ ($Q_2$) is the submatrix of $Q$ comprising the first $n$
(last $m - n$) columns.
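
The QR route to the estimate can be sketched in a few lines. The code below is only an illustration under stated assumptions: it builds the thin factors $Q_1$, $R_1$ by modified Gram-Schmidt (a stand-in chosen for brevity; library codes normally use Householder reflections), and the function names `thin_qr` and `lsq_qr` and the test data are hypothetical.

```python
def thin_qr(C):
    """Thin QR factorization C = Q1 R1 by modified Gram-Schmidt.

    C is a list of m rows of length n (m >= n, full column rank).
    Returns Q1 as a list of n columns and R1 as n x n upper-triangular rows.
    """
    m, n = len(C), len(C[0])
    cols = [[C[i][j] for i in range(m)] for j in range(n)]
    Q1 = []
    R1 = [[0.0] * n for _ in range(n)]
    for j in range(n):
        v = cols[j][:]
        for k in range(j):
            # remove the component of v along the k-th orthonormal column
            R1[k][j] = sum(Q1[k][i] * v[i] for i in range(m))
            v = [v[i] - R1[k][j] * Q1[k][i] for i in range(m)]
        R1[j][j] = sum(x * x for x in v) ** 0.5
        Q1.append([x / R1[j][j] for x in v])
    return Q1, R1


def lsq_qr(C, y):
    """Solve min ||y - C a||_2 through the triangular system R1 a = Q1^T y."""
    Q1, R1 = thin_qr(C)
    n = len(R1)
    rhs = [sum(q[i] * y[i] for i in range(len(y))) for q in Q1]
    a = [0.0] * n
    for j in reversed(range(n)):          # back substitution
        s = rhs[j] - sum(R1[j][k] * a[k] for k in range(j + 1, n))
        a[j] = s / R1[j][j]
    return a


# Straight line a0 + a1*x fitted to exact data 1 + 2x (hypothetical data)
C = [[1.0, x] for x in (0.0, 1.0, 2.0, 3.0)]
y = [1.0, 3.0, 5.0, 7.0]
a = lsq_qr(C, y)
```

The normal-equations matrix $C^T C$ is never formed, which is the point of the factorization approach.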

For a general (full rank) uncertainty matrix $V$ with a factorization $V =
LL^T$ (cf. section 3.4), where $L$ is an $m \times m$ matrix, also necessarily full
rank, the least squares estimate is given by


$$a = \tilde{C}^\dagger \tilde{y}, \qquad \tilde{C} = L^{-1} C, \qquad \tilde{y} = L^{-1} y,$$


where $\tilde{C}^\dagger$ is the pseudo-inverse of $\tilde{C}$. For well conditioned $V$ and $L$, this
approach is satisfactory. However, if $L$ is poorly conditioned the formation
and use of $\tilde{C}$, etc., can be expected to introduce numerical errors. The
generalized QR factorization approach avoids this potential numerical
instability. Suppose $V = LL^T$, where $L$ is $m \times p$. (Often $p = m$ but the
approach applies in the more general case.) The estimate $a$ can be found
by solving


$$\min_{\alpha, z} \; z^T z \quad \text{subject to constraints} \quad y = C\alpha + Lz. \tag{2}$$


Note that if $L$ is invertible,

$$z = L^{-1}(y - C\alpha), \qquad z^T z = (y - C\alpha)^T V^{-1} (y - C\alpha).$$


We factorize $C = QR$ and $Q^T L = TU$ where $R$ and $T$ are upper-triangular
and $Q$ and $U$ are orthogonal. Multiplying the constraints by $Q^T$, we have


$$\begin{bmatrix} \tilde{y}_1 \\ \tilde{y}_2 \end{bmatrix} = \begin{bmatrix} R_1 \\ 0 \end{bmatrix} \alpha + \begin{bmatrix} T_{11} & T_{12} \\ 0 & T_{22} \end{bmatrix} \begin{bmatrix} \tilde{z}_1 \\ \tilde{z}_2 \end{bmatrix}, \tag{3}$$

where $\tilde{y} = Q^T y$, and $\tilde{z} = Uz$. From the second set of equations, $\tilde{z}_2$ must
satisfy $\tilde{y}_2 = T_{22} \tilde{z}_2$. Given any $\tilde{z}_1$, the first set of equations is satisfied
if $R_1 a = \tilde{y}_1 - T_{11} \tilde{z}_1 - T_{12} \tilde{z}_2$. We choose $\tilde{z}_1 = 0$ in order to minimize
$z^T z = \tilde{z}^T \tilde{z} = \tilde{z}_1^T \tilde{z}_1 + \tilde{z}_2^T \tilde{z}_2$, so that $a$ solves $R_1 a = \tilde{y}_1 - T_{12} \tilde{z}_2$.
Public-domain library software for solving (2) and, more generally, computing
generalized QR factorizations is available.




2.1. Uncertainty matrix associated with the least squares 
estimate 


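For fixed $C$ and $Y$ an $m$-vector of random variables, the equation $A = C^\dagger Y$
defines the $n$-vector $A$ of random variables as linear combinations of $Y$.
Taking expectations, we have

$$E(A) = E(C^\dagger Y) = C^\dagger \eta = C^\dagger C \alpha = \alpha,$$


and


$$V_A = \mathrm{Var}(A) = C^\dagger V (C^\dagger)^T = (C^T V^{-1} C)^{-1}.$$

Here $C^\dagger = (C^T V^{-1} C)^{-1} C^T V^{-1}$ is the matrix mapping $y$ to the least squares
estimate for the given $V$; it reduces to the pseudo-inverse of $C$ when $V = I$.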


2.2. Conformity 


Consider first the case where $\mathrm{Var}(Y) = I$. If $X_i \sim N(0,1)$, $i = 1,\ldots,m$,
then $\sum_{i=1}^m X_i^2$ has a $\chi^2_m$ distribution with $E(\chi^2_m) = m$ and $\mathrm{Var}(\chi^2_m) = 2m$.
Let $R$ be the random vector of residuals so that


$$R = Y - CA = Y - CC^\dagger Y = (I - CC^\dagger) Y.$$

If $C = Q_1 R_1$, then $CC^\dagger = Q_1 Q_1^T$ and $I - Q_1 Q_1^T = Q_2 Q_2^T$, so that

$$S^2 = R^T R = (Q_2^T Y)^T Q_2^T Y.$$


Now $Q$ is orthogonal so setting $\tilde{Y} = Q^T Y$ we have $\mathrm{Var}(\tilde{Y}) = I$ also. Therefore,
$S^2 = \sum_{i=n+1}^m \tilde{Y}_i^2$ is a sum of squares of $m - n$ normal variates and
has a $\chi^2_\nu$ distribution with $\nu = m - n$ degrees of freedom, with $E(S^2) = \nu$,
$\mathrm{Var}(S^2) = 2\nu$. In general, given a least squares estimate $a = a(y,V)$
associated with a conforming model and data, we expect


$$\chi^2 = (y - Ca)^T V^{-1} (y - Ca) \approx \nu.$$


If $\chi^2$ is far from $\nu$ (relative to $(2\nu)^{1/2}$) we may wish to reassess $y$, $V$, $a$ and
$V_A$.
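
As an illustration of this check, the sketch below evaluates $\chi^2$, $\nu$ and the standardized distance $(\chi^2 - \nu)/(2\nu)^{1/2}$ for a fit with a diagonal uncertainty matrix. The function name `conformity`, the residuals and the variances are hypothetical, and how large a distance counts as nonconforming is left to the analyst.

```python
import math


def conformity(residuals, variances, n_params):
    """Return (chi2, nu, z) for a fit with diagonal uncertainty matrix.

    chi2 = sum r_i^2 / v_ii, nu = m - n, and z = (chi2 - nu) / sqrt(2 nu)
    expresses the distance of chi2 from its expected value in units of
    its standard deviation.
    """
    m = len(residuals)
    nu = m - n_params
    chi2 = sum(r * r / v for r, v in zip(residuals, variances))
    z = (chi2 - nu) / math.sqrt(2.0 * nu)
    return chi2, nu, z


# Hypothetical residuals from a one-parameter fit to five observations
r = [0.1, -0.2, 0.05, 0.3, -0.25]
v = [0.04] * 5
chi2, nu, z = conformity(r, v, n_params=1)
```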


2.3. Example applications 


Interlaboratory comparison exercises. In an interlaboratory comparison
exercise, one or more artefacts are measured by a number of laboratories, with
the $i$th laboratory recording a measured value $y_{i,k}$ for the $k$th measurand
$\alpha_k$ and an associated standard uncertainty $u_{i,k}$. From $\{y_{i,k}\}$ and $\{u_{i,k}\}$ we
generally wish to determine estimates $y_k$ of the measurands $\alpha_k$ and their
associated uncertainties $u_k$. The analysis of such data is often complicated
by the fact that the dispersion of the $\{y_{i,k}\}$ cannot be accounted for easily
by the stated uncertainties $\{u_{i,k}\}$.

Adjustment of the fundamental constants. There are many fundamental
constants of physics, a small number of which such as the speed of light
in a vacuum have exact assigned values, the others being estimated from
measurements provided by a large number of experiments. Generally, each
experiment provides an estimate $y_i$ of $\eta_i = c_i(\alpha)$, a quantity related to
some subset of the fundamental constants $\alpha$, and an associated standard
uncertainty $u_i$. From this ensemble of measurements, estimates of the
constants are found by solving a large nonlinear least squares problem. The
last such adjustment exercise was undertaken in 1999 [8] and took account of
all the relevant data available up to 31 December 1998 relating to over 50
constants. In such exercises, the conformity of the model and input data
is of great concern since, due to the complex functional inter-relationships,
an invalid input could have a significant effect on a number of parameter
estimates.


3. Adjustment procedures 


In this section, we consider ways of adjusting the input information $y$ and
$V$ in order to achieve conformity of the least squares solution. We regard
the least squares estimate $a = a(y, V)$ as a function of the measurements
$y$ and the input uncertainty matrix $V$. Similarly,


$$\chi^2 = \chi^2(y, V) = (y - Ca)^T V^{-1} (y - Ca)$$

is also a function of $y$ and $V$. We look for adjustments $(y, V) \to (\hat{y}, \hat{V})$ so
that $\hat{\chi}^2 = \chi^2(\hat{y}, \hat{V}) = \nu$ (or acceptably close to $\nu$ in terms of the distribution
$\chi^2_\nu$). Rather than allowing complete freedom to adjust $y$ and $V$, we may
consider a modest number of degrees of freedom parametrized by $\lambda =
(\lambda_1,\ldots,\lambda_p)^T$ so that $\hat{y} = \hat{y}(\lambda)$, $\hat{V} = \hat{V}(\lambda)$ and $\hat{\chi}^2 = \chi^2(\lambda)$. We seek
adjustments that, in some sense, make minimal changes to $y$ and $V$ in
order to bring about conformity. Our definition of minimal will reflect our
belief in the input estimates $y$ and $V$ and the underlying model.

From a computational point of view we would like some measure $\| \bullet \|$
so that the adjustment problem can be posed as


$$\min_{\lambda} \; \| (\hat{y}(\lambda), \hat{V}(\lambda)) - (y, V) \| \tag{4}$$

subject to the constraints


$$C^T \hat{V}^{-1}(\lambda) C a = C^T \hat{V}^{-1}(\lambda) \hat{y}(\lambda),$$


$$(\hat{y}(\lambda) - Ca)^T \hat{V}^{-1}(\lambda) (\hat{y}(\lambda) - Ca) = \nu. \tag{5}$$


The first constraint states that $a$ is the least squares solution associated
with $(\hat{y}(\lambda), \hat{V}(\lambda))$, while the second is the conformity constraint. Below,
we usually assume that only $V$ is to be adjusted, but all the algorithms can
be adapted for the more general case.


3.1. One-parameter adjustment 


In the case where we allow only one degree of freedom, parametrized by
$\lambda$, for adjustment, the conformity constraint specifies $\lambda$ completely (if we
ignore the possibility of multiple solutions) and no measure $\| \bullet \|$ is required.

Birge procedure. In the Birge procedure,


$$\hat{V}(\lambda) = \lambda V, \tag{6}$$


representing a re-scaling of the input uncertainty matrix. If $r$ is the vector
of residuals associated with $y$ and $V$, then $\lambda = (r^T V^{-1} r)/\nu$. This procedure
reflects a belief that the input uncertainty matrix is known only up to a scale
factor and is often used in the case where $V = \sigma^2 I$, where $\sigma$ is unknown.
The estimate $a$ remains unchanged.
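
A minimal sketch of the re-scaling for the simplest model, a common value measured several times with a diagonal $V$ (so that $C$ is a column of ones and the estimate is the weighted mean). The factor computed is the one that makes $\chi^2$ evaluated with $\lambda V$ equal to $\nu$; the function names and the data are hypothetical.

```python
def weighted_mean(y, v):
    """Least squares estimate for the model eta_i = a with V = diag(v)."""
    w = [1.0 / vi for vi in v]
    return sum(wi * yi for wi, yi in zip(w, y)) / sum(w)


def birge_factor(y, v):
    """Scale factor lam such that chi2 computed with lam*V equals nu."""
    a = weighted_mean(y, v)
    nu = len(y) - 1                      # m observations, n = 1 parameter
    chi2 = sum((yi - a) ** 2 / vi for yi, vi in zip(y, v))
    return chi2 / nu, a


y = [10.0, 10.4, 9.5, 11.2]      # hypothetical measured values
v = [0.01, 0.04, 0.01, 0.09]     # hypothetical input variances
lam, a = birge_factor(y, v)
v_adj = [lam * vi for vi in v]   # adjusted variances
chi2_adj = sum((yi - a) ** 2 / vi for yi, vi in zip(y, v_adj))
a_adj = weighted_mean(y, v_adj)  # unchanged by a common scale factor
```

The standard uncertainties are multiplied by $\lambda^{1/2}$, and the weights change only by a common factor, which is why the estimate does not move.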

Hidden independent random effect [13]. In this procedure, $\hat{V}(\lambda) = V + \lambda I$,
corresponding to the belief that each observation was subject to hidden,
independent random effects with zero mean and unknown variance $\lambda$. The
associated measurement model is


$$Y = \eta + E + F, \qquad E(F) = 0, \qquad \mathrm{Var}(F) = \lambda I. \tag{7}$$


Parameters $a$ and $\lambda$ are specified by the nonlinear equations (5). We note
that if $V$ has eigen-decomposition $V = U D U^T$, with $U$ orthogonal and $D$
diagonal, then $V + \lambda I = U(D + \lambda I)U^T$ and


$$[V + \lambda I]^{-1} = U (D + \lambda I)^{-1} U^T.$$


In general $a(\lambda) \ne a(0) = a$, but, for the case $V = \sigma^2 I$, this procedure is
equivalent to the Birge procedure.
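
For the weighted-mean model with diagonal $V$, the single unknown $\lambda$ can be found by a one-dimensional root search, since $\chi^2$ decreases monotonically as $\lambda$ grows. The sketch below uses plain bisection; this is an illustrative choice rather than the method of the paper, and the function names and data are hypothetical.

```python
def chi2_hidden(lam, y, v):
    """chi2 and estimate for the weighted-mean fit with variances v_i + lam."""
    w = [1.0 / (vi + lam) for vi in v]
    a = sum(wi * yi for wi, yi in zip(w, y)) / sum(w)
    return sum(wi * (yi - a) ** 2 for wi, yi in zip(w, y)), a


def solve_hidden_effect(y, v, tol=1e-12):
    """Find lam >= 0 with chi2(lam) = nu by bisection."""
    nu = len(y) - 1
    if chi2_hidden(0.0, y, v)[0] <= nu:
        return 0.0                      # already conforming
    lo, hi = 0.0, 1.0
    while chi2_hidden(hi, y, v)[0] > nu:
        hi *= 2.0                       # expand until the root is bracketed
    while hi - lo > tol * max(1.0, hi):
        mid = 0.5 * (lo + hi)
        if chi2_hidden(mid, y, v)[0] > nu:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


y = [10.0, 10.4, 9.5, 11.2]      # hypothetical measured values
v = [0.01, 0.04, 0.01, 0.09]     # hypothetical input variances
lam = solve_hidden_effect(y, v)
chi2_0, a_0 = chi2_hidden(0.0, y, v)
chi2_adj, a_adj = chi2_hidden(lam, y, v)
```

Unlike the Birge re-scaling, the added variance changes the relative weights, so the estimate generally moves.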

Hidden correlated random effects. Let $r$ be the vector of residuals
associated with $y$ and $V$ and set $\hat{V}(\lambda) = V + \lambda r r^T$. Doing so corresponds
to the measurement model


$$Y = \eta + E + F r, \qquad E(F) = 0, \qquad \mathrm{Var}(F) = \lambda. \tag{8}$$

In this case, $a(\lambda) = a$.

All of the above procedures have the form $\hat{V}(\lambda) = V_0 + \lambda V_1$. We note
that, from the generalized eigenvalue decomposition for the pair $(V_0, V_1)$,
we can find a nonsingular matrix $X$ such that $D_0 = X^T V_0 X$ and $D_1 =
X^T V_1 X$ are diagonal matrices. It follows that


$$\hat{V}^{-1}(\lambda) = X (D_0 + \lambda D_1)^{-1} X^T.$$


3.2. Maximum likelihood approaches to adjustment 


For adjustment procedures involving parameters $\lambda = (\lambda_1,\ldots,\lambda_p)^T$, $p > 1$,
it is necessary to define a measure $M(\lambda)$ as in (4). Rather than define
ad hoc measures, we consider ones having a probabilistic justification. We
make the further assumption about the model (1) that $E \sim N(0, \hat{V}(\lambda))$.
Then, given data $y \in Y$, the probability $p(y|\alpha, \lambda)$ that the data arose from
parameters $(\alpha, \lambda)$ is given by


$$p(y|\alpha, \lambda) = \frac{1}{|2\pi \hat{V}(\lambda)|^{1/2}} \exp\left\{ -\frac{1}{2} (y - C\alpha)^T \hat{V}^{-1}(\lambda) (y - C\alpha) \right\}$$

and the log likelihood $L(\alpha, \lambda|y)$ by


$$-L(\alpha, \lambda|y) = \frac{1}{2} \log |2\pi \hat{V}(\lambda)| + \frac{1}{2} (y - C\alpha)^T \hat{V}^{-1}(\lambda) (y - C\alpha).$$


The maximum likelihood (ML) estimates of $\alpha$ and $\lambda$ are determined by
maximizing the probability $p(y|\alpha, \lambda)$ which is equivalent to minimizing
$-L(\alpha, \lambda|y)$ with respect to $\alpha$ and $\lambda$.

From its definition, an ML approach finds the most likely explanation
of the data. To the extent that the quantity $\chi^2 - \nu$ is also a measure of
the likelihood of the data, we generally expect ML methods to produce
a $\chi^2$ value that is as reasonable as the adjustment procedure will allow.
For example, if $\hat{V} = \lambda V$, the ML estimate is $\lambda_M = r^T V^{-1} r/m$, whereas
the conforming $\lambda_c = r^T V^{-1} r/(m - n)$. The more flexibility there is in the
adjustment procedure, the ‘easier’ it is for an ML estimate to satisfy the
conformity constraint. However, increasing the flexibility also means that
the ML optimization problem is more likely to be ill-posed. For example, we
can apply an ML approach for the case $\hat{V}(\lambda) = \lambda_1 V + \lambda_2 I$ where we allow
both the possibility of re-scaling and hidden random effects. If $V$ is close to
a multiple of the identity matrix $I$, the corresponding estimation problem
is poorly posed since the two parameters $\lambda_1$ and $\lambda_2$ specify essentially the
same behaviour. In these situations, we can augment the log likelihood
function with a regularization term $R(\lambda)$ in order to help resolve any such
ambiguity. Again, we would like the regularization term to encode, in a
probabilistic way, our prior belief about the likely range of values for $\lambda$.
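
The gap between the ML and conforming scale factors in the one-parameter re-scaling example follows from a short calculation. The derivation below is a sketch under the assumption that the adjusted matrix is $\lambda V$ with $V$ fixed, so that the minimizing $a$ does not depend on $\lambda$ and $r = y - Ca$ is the ordinary least squares residual vector.

```latex
\begin{align*}
-L(a,\lambda \mid y) &= \tfrac{m}{2}\log(2\pi\lambda) + \tfrac12\log|V|
                        + \frac{1}{2\lambda}\, r^{T}V^{-1}r, \\
\frac{\partial(-L)}{\partial\lambda}
   &= \frac{m}{2\lambda} - \frac{1}{2\lambda^{2}}\, r^{T}V^{-1}r = 0
   \;\Longrightarrow\; \lambda_M = \frac{r^{T}V^{-1}r}{m}, \\
\frac{r^{T}V^{-1}r}{\lambda_c} &= m-n
   \;\Longrightarrow\; \lambda_c = \frac{r^{T}V^{-1}r}{m-n}
   = \frac{m}{m-n}\,\lambda_M .
\end{align*}
```

The ML factor is therefore always the smaller of the two, and the difference vanishes as $m$ grows relative to $n$.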

With this generalization, we can then calculate the most likely conforming
solution by solving


$$\min_{\lambda} \; \log |\hat{V}(\lambda)| + R(\lambda) \tag{9}$$


subject to the constraints (5).


3.3. Algorithms for diagonal uncertainty matrices 


We now consider the case where the uncertainty matrix $V$ is diagonal and
the adjusted $\hat{V}(\lambda)$ is also constrained to be diagonal. Let $D = D(\lambda) =
\hat{V}^{-1}(\lambda)$. If $\tilde{y} = D^{1/2} y$, $\tilde{C} = D^{1/2} C$, and $\tilde{C} = QR$ is its QR factorization,
then $a$ solves the linear least squares problem $\tilde{C} a = \tilde{y}$. That $a$ is constrained
to be the least squares solution associated with $\hat{V}(\lambda)$ can be stated as


$$C^T D(\lambda) C a = C^T D(\lambda) y \tag{10}$$


and defines $a = a(\lambda)$ implicitly as a function of $\lambda$. If $\dot{D}$ and $\dot{a}$ denote the
derivatives of $D$ and $a$ with respect to one of the parameters $\lambda_q$, then


$$C^T D C \dot{a} = C^T \dot{D} (y - Ca)$$

defines $\dot{a}$ in terms of $\dot{D}$. We note that this system of linear equations can
be solved by solving two triangular systems involving $R$.
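
The derivative relation and the two triangular solves can be written out as follows. This is a sketch of the reasoning: $R_1$ denotes the $n \times n$ triangular block of the QR factor of $D^{1/2} C$, and the intermediate vector $w$ is introduced here purely for illustration.

```latex
\begin{align*}
C^{T}D(\lambda)\,C\,a(\lambda) &= C^{T}D(\lambda)\,y \\
\Longrightarrow\quad
C^{T}\dot D\,C\,a + C^{T}D\,C\,\dot a &= C^{T}\dot D\,y \\
\Longrightarrow\quad
C^{T}D\,C\,\dot a &= C^{T}\dot D\,(y - Ca), \\
C^{T}D\,C = R_1^{T}R_1 \quad\Longrightarrow\quad
R_1^{T}w &= C^{T}\dot D\,(y - Ca), \qquad R_1\,\dot a = w .
\end{align*}
```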
The associated log likelihood $L$ for a data set $y$ is given by

$$-L(a, \lambda|y) = \frac{1}{2} \sum_{i=1}^{m} \left\{ \log 2\pi \hat{v}_{ii}(\lambda) + \frac{(y_i - c_i^T a)^2}{\hat{v}_{ii}(\lambda)} \right\}.$$

Consider the case where $\hat{v}_{ii}(\lambda) = \lambda_i$, $i = 1,\ldots,m$, and our belief in the
input $V$ is encoded as


$$\log v_{ii} \sim N(\log \lambda_i, (\log \rho)^2), \qquad \rho > 1.$$


Roughly, this says that we are 95% certain that $\lambda_i/\rho^2 \le v_{ii} \le \lambda_i \rho^2$. (We
might prefer to use, for example, a gamma distribution instead of a log
normal distribution to represent our confidence in $V$, but the general
approach would be essentially the same.) An appropriate regularization term
is therefore


$$R(\lambda) = \frac{1}{(\log \rho)^2} \sum_{i=1}^{m} (\log v_{ii} - \log \lambda_i)^2.$$


The generalized ML estimate is found by solving


$$\min_{a, \lambda} \; \sum_{i=1}^{m} \left\{ \log \lambda_i + \frac{(y_i - c_i^T a)^2}{\lambda_i} \right\} + R(\lambda).$$


(If we believed that $v_{ii}$ and $y_i$ were observations associated with
independent random variables, then this objective function is essentially
$-L(a, \lambda|y, V)$, where $L$ is the log likelihood function. In practice, $v_{ii}$ and $y_i$
are quite likely to be jointly distributed in which case the joint distribution
would be used in the definition of the log likelihood function.) The most
likely conforming estimate is found by solving


$$\min_{\lambda} \; \sum_{i=1}^{m} \log \lambda_i + R(\lambda) \tag{11}$$

subject to the conformity constraint

$$\sum_{i=1}^{m} \frac{1}{\lambda_i} (y_i - c_i^T a(\lambda))^2 = \nu,$$


regarding $a = a(\lambda)$ as defined implicitly by (10). If required we can impose
additional equality or inequality constraints on $\lambda_i$. For example, if we
believe that no $v_{ii}$ is an overestimate we can impose the constraints


$$\lambda_i \ge v_{ii}, \qquad i = 1,\ldots,m. \tag{12}$$


3.4. Algorithms for general uncertainty matrices 


For the case of general uncertainty matrices, we wish to use a generalized
QR factorization approach to avoid potential numerical instabilities
associated with forming and using the inverses of matrices. We also need to be
able to calculate the gradient of the objective and constraint functions in
(9) in order to employ efficient constrained optimization algorithms.

Any symmetric positive definite matrix $V$ has a Cholesky factorization
of the form $V = LL^T$, where $L$ is a lower triangular matrix. The matrix
$V$ can similarly be factored as $V = TT^T$ where $T$ is an upper-triangular
matrix. These factorizations can be computed in a numerically stable way
using a simple algorithm. For example, consider the factorization


$$TT^T = \begin{bmatrix} T_{11} & t_{12} \\ 0 & t_{22} \end{bmatrix} \begin{bmatrix} T_{11}^T & 0 \\ t_{12}^T & t_{22} \end{bmatrix} = \begin{bmatrix} V_{11} & v_{12} \\ v_{12}^T & v_{22} \end{bmatrix}.$$


Equating terms we have $t_{22}^2 = v_{22}$, $t_{22} t_{12} = v_{12}$ and $T_{11} T_{11}^T = V_{11} - t_{12} t_{12}^T$.
The problem is now reduced to finding the factorization of the modified
submatrix $V_{11} - t_{12} t_{12}^T$. The complete factorization can be achieved by
repeating this step:


I Set $T(i,j) = V(i,j)$, for all $i \le j$.
II For $k = n : -1 : 1$, set $T(k,k) = T(k,k)^{1/2}$ and

$T(1:k-1, k) = T(1:k-1, k)/T(k,k)$.
II.j For $j = k-1 : -1 : 1$, set
$T(1:j, j) = T(1:j, j) - T(1:j, k)\, T(j,k)$.
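
The three steps translate directly into code. The sketch below is a plain transcription using nested lists and zero-based indices; the function name `upper_factor` and the test matrix are hypothetical, and no check for positive definiteness is made.

```python
def upper_factor(V):
    """Return upper-triangular T with V = T T^T (V symmetric positive definite)."""
    n = len(V)
    # Step I: copy the upper triangle of V into T
    T = [[V[i][j] if j >= i else 0.0 for j in range(n)] for i in range(n)]
    for k in range(n - 1, -1, -1):
        # Step II: finish column k
        T[k][k] = T[k][k] ** 0.5
        for i in range(k):
            T[i][k] /= T[k][k]
        # Step II.j: form the modified submatrix V11 - t12 t12^T
        for j in range(k - 1, -1, -1):
            for i in range(j + 1):
                T[i][j] -= T[i][k] * T[j][k]
    return T


V = [[4.0, 2.0, 1.0], [2.0, 3.0, 0.5], [1.0, 0.5, 2.0]]   # hypothetical SPD matrix
T = upper_factor(V)
# Rebuild T T^T to confirm that the factorization reproduces V
recon = [[sum(T[i][k] * T[j][k] for k in range(3)) for j in range(3)]
         for i in range(3)]
```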
Suppose now that $V = V(\lambda) = T(\lambda) T^T(\lambda)$, and $\dot{V} = \partial V/\partial \lambda$ and $\dot{T} =
\partial T/\partial \lambda$. The matrix $\dot{T}$ satisfies $\dot{T} T^T + T \dot{T}^T = \dot{V}$ and can be determined by
differentiating the algorithm to compute $T$:


I Set $\dot{T}(i,j) = \dot{V}(i,j)$, for all $i \le j$.
II For $k = n : -1 : 1$, set $\dot{T}(k,k) = \dot{T}(k,k)/(2T(k,k))$ and

$\dot{T}(1:k-1, k) = (\dot{T}(1:k-1, k) - \dot{T}(k,k)\, T(1:k-1, k))/T(k,k)$.
II.j For $j = k-1 : -1 : 1$, set
$\dot{T}(1:j, j) = \dot{T}(1:j, j) - \dot{T}(1:j, k)\, T(j,k) - T(1:j, k)\, \dot{T}(j,k)$.


This algorithm can be easily vectorized to compute all the partial derivatives
of $T(\lambda)$ simultaneously.
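
A sketch of the differentiated algorithm, checked against the identity $\dot{T} T^T + T \dot{T}^T = \dot{V}$. The example assumes $V(\lambda) = V + \lambda I$, so that $\dot{V} = I$; the function names and the matrix are hypothetical, and the factorization routine is repeated so that the block is self-contained.

```python
def upper_factor(V):
    """Upper-triangular T with V = T T^T (V symmetric positive definite)."""
    n = len(V)
    T = [[V[i][j] if j >= i else 0.0 for j in range(n)] for i in range(n)]
    for k in range(n - 1, -1, -1):
        T[k][k] = T[k][k] ** 0.5
        for i in range(k):
            T[i][k] /= T[k][k]
        for j in range(k - 1, -1, -1):
            for i in range(j + 1):
                T[i][j] -= T[i][k] * T[j][k]
    return T


def upper_factor_derivative(T, Vdot):
    """Return Tdot with Tdot T^T + T Tdot^T = Vdot, given the factor T."""
    n = len(T)
    Td = [[Vdot[i][j] if j >= i else 0.0 for j in range(n)] for i in range(n)]
    for k in range(n - 1, -1, -1):
        Td[k][k] = Td[k][k] / (2.0 * T[k][k])
        for i in range(k):
            Td[i][k] = (Td[i][k] - Td[k][k] * T[i][k]) / T[k][k]
        for j in range(k - 1, -1, -1):
            for i in range(j + 1):
                Td[i][j] -= Td[i][k] * T[j][k] + T[i][k] * Td[j][k]
    return Td


def defect(T, Td, Vdot):
    """Largest entry of |Tdot T^T + T Tdot^T - Vdot|."""
    n = len(T)
    worst = 0.0
    for i in range(n):
        for j in range(n):
            s = sum(Td[i][k] * T[j][k] + T[i][k] * Td[j][k] for k in range(n))
            worst = max(worst, abs(s - Vdot[i][j]))
    return worst


V = [[4.0, 2.0, 1.0], [2.0, 3.0, 0.5], [1.0, 0.5, 2.0]]      # hypothetical V(0)
Vdot = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]   # V(lam) = V + lam*I
T = upper_factor(V)
Td = upper_factor_derivative(T, Vdot)
err = defect(T, Td, Vdot)
```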

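Turning to the general problem (4-5), let $C = QR$, set $\tilde{y}(\lambda) = Q^T \hat{y}(\lambda)$
and factorize $Q^T \hat{V}(\lambda) Q = T(\lambda) T^T(\lambda)$. Corresponding to (3) we have


$$\begin{bmatrix} \tilde{y}_1(\lambda) \\ \tilde{y}_2(\lambda) \end{bmatrix} = \begin{bmatrix} R_1 \\ 0 \end{bmatrix} a + \begin{bmatrix} T_{11}(\lambda) & T_{12}(\lambda) \\ 0 & T_{22}(\lambda) \end{bmatrix} \begin{bmatrix} \tilde{z}_1 \\ \tilde{z}_2 \end{bmatrix}$$


and from the analysis above we know that if $\tilde{y}_2(\lambda) = T_{22}(\lambda) \tilde{z}_2$ then the $a$
that solves $R_1 a = \tilde{y}_1(\lambda) - T_{12}(\lambda) \tilde{z}_2$ is the least squares solution associated
with $\hat{y}(\lambda)$ and $\hat{V}(\lambda)$. Moreover the solution $a$ will satisfy the $\chi^2$ constraint if
$\tilde{z}_2^T \tilde{z}_2 = \nu$. Since this latter constraint does not involve $a$ we can reformulate
(4-5) as

$$\min_{\lambda, \tilde{z}_2} \; M(\lambda) \quad \text{subject to} \quad \tilde{y}_2(\lambda) = T_{22}(\lambda) \tilde{z}_2, \quad \tilde{z}_2^T \tilde{z}_2 = \nu.$$

In this formulation, there is no requirement to form the inverse of a matrix
and the partial derivatives of the constraint equations are easily determined
from the partial derivatives of $\hat{y}(\lambda)$, $\hat{V}(\lambda)$ and its factorization.


4. Numerical examples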


In this section we compare the behaviour of four adjustment procedures:
A) Birge procedure (6), B) hidden independent effects (7), C) hidden
correlated effects (8) and D) a maximum likelihood approach (11) incorporating
constraints (12) to ensure that no input uncertainty is reduced. The latter
procedure was implemented straightforwardly using the NAG constrained
optimization routine E04UCF. Below, $u_{ii} = v_{ii}^{1/2}$ and $\hat{u}_{ii} = \hat{v}_{ii}^{1/2}$ are the input
and adjusted standard uncertainties associated with the observation $y_i$.


4.1. Measurements of G 


The first set of data concerns the ten measurements of the Newtonian
constant of gravitation $G$ considered by Weise and Wöger. Fig. 1 plots the
estimates $y_i$ and corresponding coverage intervals $y_i \pm 2u_{ii}$ and $y_i \pm 2\hat{u}_{ii}$
associated with the input (first interval) and three adjusted uncertainties
corresponding to algorithms B, C and D. Observations 4 and 7 have relatively
large input uncertainties. Observation 10 has a small input uncertainty
but a value considerably different from the other measurements. In order
to achieve conformity using procedure A, all the input uncertainties have to
be multiplied by 22.6. Procedure B achieves conformity by large relative
increases to all the uncertainties apart from those associated with
observations 4 and 7. Procedure C has a more modest effect on all the uncertainties
except that for observation 10, which is multiplied by over 20. Procedure C
also introduces appreciable correlations (with many correlation coefficients
over 0.9 in magnitude), which are not apparent in Fig. 1. The main effect
of procedure D is to increase the uncertainty associated with observations
5 and 10 by factors of nearly four and over 30, respectively.

Another view of the behaviour of the adjustment procedures is given in
Fig. 2 which plots the residuals $r = y - Ca$ and corresponding intervals
$r_i \pm 2u_{ii}$ associated with the least squares fit to the input data and $\hat{r}_i \pm 2\hat{u}_{ii}$
associated with the fit for the three adjusted uncertainties corresponding to
algorithms B, C and D. Only with procedure D are the residuals scattered
around zero. For all the other procedures the fit appears skewed by the
effect of observation 10. More recent measurements of $G$ provide more
evidence that observation 10 is discrepant.


4.2. Simple adjustment model 


Figure 1. Input and adjusted uncertainties corresponding to procedures B, C and D
for the 10 measurements of G.


To investigate adjustment procedures for determining estimates of the
physical constants we consider a much simpler model involving five (fictitious)
physical constants $\alpha = (\alpha_1, \alpha_2, \alpha_3)^T$ and $\beta = (\beta_1, \beta_2)^T$ and nine
measurements with associated observation matrix


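$$C = \begin{bmatrix}
100 & 100 & 5 & 0 & -5 \\
100 & -100 & 0 & 5 & -5 \\
100 & 0 & 100 & -5 & 5 \\
100 & 0 & -100 & -5 & 5 \\
0 & 100 & 100 & 5 & -5 \\
0 & 100 & -100 & 5 & 5 \\
5 & 0 & 0 & 100 & 100 \\
0 & 0 & -5 & 100 & 100 \\
0 & 5 & 0 & 100 & -100
\end{bmatrix}.$$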


The first six (last three) observations contain strong information about $\alpha$
($\beta$). Given $\gamma = (\alpha^T, \beta^T)^T$, we generate exact data according to $\eta = C\gamma$
and then simulate measurement data by adding perturbations $y = \eta + e$
according to some model. For example, we have generated exact $\eta$ for
$\gamma_j = 1$, $j = 1,\ldots,5$, added random Gaussian noise to $\eta$, $y_i = \eta_i + e_i$,
$e_i \in N(0, \sigma_i^2)$, $\sigma_i \in [0.5, 3.0]$, and then subtracted 10 from $y_2$, simulating a
discrepant value. Fig. 3 shows the perturbations $y - \eta$ and corresponding
intervals $y_i - \eta_i \pm 2u_{ii}$ and $y_i - \eta_i \pm 2\hat{u}_{ii}$ associated with the input (first
interval) and four adjusted uncertainties corresponding to algorithms A, B,
C and D. Both procedures A and B result in a large increase in all the input
uncertainties, with procedure B leading to a more uniform set of
uncertainties. Procedure C has a different behaviour with significant increases in the
uncertainties associated with observations 2, 3, and 5 and small or modest
increases for the others. Procedure D produces a large increase in the
uncertainty associated with the second (discrepant) observation and modest
increases associated with observations 8 and 9.

The residuals $r = y - Ca$ for the fits associated with procedures A, B
and D are plotted in Fig. 4 along with the perturbations $y_i - \eta_i$. (Procedure
C gives the same fit as procedure A.) By assigning a large uncertainty to
observation 2, procedure D is able to tolerate a large residual. The residuals
for D approximate well the perturbations $y_i - \eta_i$. The largest residual error
associated with procedure A is for observation 5 and there is no indication
that observation 2 is discrepant in any way. The behaviour of procedure B
is somewhere between A and D.

Figure 2. Residuals $r = y - Ca$ along with error bars $r_i \pm 2u_{ii}$ associated with the least
squares fit to the input data and $\hat{r}_i \pm 2\hat{u}_{ii}$ associated with the fit for the three adjusted
uncertainties corresponding to algorithms B, C and D.

Figure 3. Input and adjusted uncertainties corresponding to procedures A, B, C and D
for the nine observations with one discrepant data point.

Figure 4. Residual errors associated with procedures A, B and D along with
perturbations $y_i - \eta_i$ (joined by a line).


5. Concluding remarks


In this paper we have been concerned with procedures for adjustment of 
input information to achieve conformity of the associated least squares so- 
lution. We have provided a general framework for such procedures and 
described algorithms to bring about conformity. These algorithms avoid 
potential numerical instabilities by using appropriate matrix factorization 
approaches. We have been concerned to define procedures that have a prob- 
abilistic foundation rather than ad hoc methods. In particular, we have de- 
scribed adjustment methods based on maximum likelihood estimation that 
combine flexibility with an appropriate use of the available information. 
Acknowledgments. The authors are grateful for the generous support 
from SofTools MetroNet and to attendees for many helpful discussions.


During the analysis of natural gas by gas chromatography, the indicated response
is affected systematically by varying environmental and instrumental conditions. 
Calibration data dispersion is attributed not only to random effects, but also to 
systematic measurement effects. The accuracy of the measurement results and 
consequently the usability of a calibration model are accordingly reduced. The 
model consists of a series of calibration curves, one for each component of the 
gas measured. When the systematic uncertainty component dominates, a corre- 
lated response behaviour between corresponding points on which the set of curves 
depends is observed. A model-based least-squares method is introduced that com- 
pensates for these effects. It incorporates correction parameters to account for the 
systematic uncertainty components, thus eliminating the correlation effects and 
reducing the calibration model uncertainty. Results are presented for calibration 
data modelled by straight-line functions. Generalizations are indicated. 


1. Introduction 
1.1. Measurement process 


N samples, each from a different standard, that consist of the same q gas 
components, are injected in turn to a gas chromatograph (GC) (figure 1). 
A flow of carrier gas (e.g., helium) supports sample circulation through the 
column of the GC. During each run, corresponding to the analysis of one 
standard, the injected gas component molecules are naturally separated. 
The detector within the GC indicates any deviation from the carrier gas
condition. It provides spectral peaks, one for each gas component present
(figure 2). The area under each peak relates to the molar fraction of gas
component measured.


*Work partially funded under EU project SofTools.MetroNet Contract N. G6RT-CT-
2001-05061 and supported by the Valid Analytical Measurement and Software Support
for Metrology programmes of the UK’s Department of Trade and Industry.


Figure 1. Schematic representation of the gas chromatograph (labelled parts:
injector, detector, carrier gas in, carrier gas out).


The application of the Ideal Gas Law gives


$$n_{ij} = x_{ij} \frac{p_i V}{R T_i}, \tag{1}$$

where $n_{ij}$ is the number of molecules of the $j$th gas component participating
in the molecular interaction in terms of $x_{ij}$, the amount fraction (gravimetric
concentration) of that gas component in the $i$th standard, $V$, the volume
of the sample loop, $R$, the ideal gas constant, $p_i$, the ambient pressure,
and $T_i$, the temperature during the $i$th measurement run. Expression (1)
applies for $j = 1,\ldots,q$, the indices of the gas components processed by the
detector, where $q < Q$, the total number of gas components constituting
the sample, and $i = 1,\ldots,N$, the indices of the standards used.

$x_{ij}$ remains fixed by definition, since it is the relative abundance of gas
molecules while, if $n_i$ is the total number of molecules injected to the GC
for the $i$th standard,


$$x_{ij} = \frac{n_{ij}}{n_i}, \qquad n_i = \sum_{j=1}^{Q} n_{ij}.$$


Figure 2. Spectral peaks provided by a gas chromatograph.


Changes in the environmental and instrumental conditions (e.g., ambient
pressure, temperature) between measurement runs affect the numbers $n_{ij}$
of injected molecules, but not their concentration ratios:


$$\frac{n_{ij}}{n_{ik}} = \frac{n_{ij}/n_i}{n_{ik}/n_i} = \frac{x_{ij}}{x_{ik}}.$$
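
A small numerical illustration of this invariance, under stated assumptions: the composition, sample-loop volume and the two sets of ambient conditions are hypothetical, the same composition is injected twice to isolate the effect of $p_i/T_i$, and the amounts are computed in moles (multiplying by the Avogadro constant would give numbers of molecules without changing any ratio).

```python
R = 8.314462618  # molar gas constant, J mol^-1 K^-1


def injected_amounts(x, p, V, T):
    """Amounts n_ij = x_ij * p_i * V / (R * T_i) for one run, in mol."""
    return [xj * p * V / (R * T) for xj in x]


x = [0.90, 0.07, 0.03]     # hypothetical amount fractions of three components
V = 1.0e-6                 # hypothetical sample-loop volume, m^3
run_a = injected_amounts(x, p=101325.0, V=V, T=293.15)
run_b = injected_amounts(x, p=99000.0, V=V, T=298.15)

# The injected amounts differ between runs, their ratios do not
ratio_a = run_a[0] / run_a[1]
ratio_b = run_b[0] / run_b[1]
```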
If $f_j$ denotes the relative response factor of the detector, the theoretical
area $y_{ij}$ recorded by the detector for the $j$th gas component in the $i$th
standard is, using expression (1),


$$y_{ij} = f_j n_{ij} = f_j x_{ij} \frac{p_i V}{R T_i}. \tag{2}$$


1.2. Experimental considerations 


The changes in the environmental and instrumental conditions between
runs that affect the $n_{ij}$ accordingly influence in a systematic manner the
indicated areas $y_{ij}$ for all $q$ measured gas components. Let $\varepsilon_{ij}$ denote the
random error, regarded as a signal from an unknown source, associated with
the measurement result $y_{ij}$. Let $u(v)$ denote the standard uncertainty
associated with an estimate $v$ of the value of a quantity. When the systematic
uncertainty component dominates the data, i.e.,


$$u(p_i/T_i) \gg u(\varepsilon_{ij}), \tag{3}$$


a correlated response behaviour, termed covariation, is observed among
the placement of the calibration curves for all $q$ gas components
participating in the $i$th experimental run. Conversely, when the random uncertainty
component dominates the calibration data, the inequality (3) is reversed,
and the indicated area values are randomly dispersed about the derived
calibration curves.

The objective is to adjust the measurement results to eliminate the 
systematic effects. The adjusted calibration data would then deviate only 
randomly from the fitted model. Consideration is thus given to the
application of multiplicative correction factors to the standards to improve
the consistency of the indicated responses y;;. The use of the corrected 
data set reduces the uncertainty associated with the calibration curves and 
achieves an improved simultaneous model fit. One purpose of this work is 
to quantify that reduction for actual data. 

The calibration data consists of stimulus values $x_{ij}$ and corresponding
response measurements $y_{ij}$, $i = 1,\ldots,N$, for each gas component (indexed
by) $j = 1,\ldots,q$. This data generally exhibits an underlying proportional
behaviour (as in figure 3):


$$y_{ij} = f_j x_{ij} \frac{p_i V}{R T_i} + \varepsilon_{ij}. \tag{4}$$


Expression (4) constitutes the measurement model used.

Figure 4 shows a calibration data set free from covariation,
where $u(y_{ij}) \approx u(\varepsilon_{ij})$.

Figure 5 shows an example of covariation between data points.
Corresponding data points are influenced similarly, tending to be either ‘high’ or
‘low’ together. Following correction it is expected that $u(y_{ij}) \approx u(\varepsilon_{ij})$, i.e.,
that the uncertainty is essentially associated only with a random process.

The determination and use of straight-line calibration curves is consid- 
ered in section 2. Numerical methods for constructing the calibration curve 
parameters are the subject of section 3. The uncertainties associated with 
the calibration curve parameters and with the molar fractions obtained 
from the use of the calibration curves are addressed in section 4. Results 
obtained using the approach are given in section 5. Possible extensions
of the approach, to higher-order calibration curves and other uncertainty
structures, are indicated in section 6.

Figure 3. Typical calibration data, having an underlying straight-line behaviour
(axes: nitrogen gravimetric concentration against nitrogen area).

Figure 5. An example of covariation between data points.


2. Determination and use of the calibration curves 


The determination of the calibration curves is considered first without ac- 
counting for systematic effects, and then the required corrections are intro- 
duced within a multiplicative model to derive improved calibration curves. 

The calibration curves are then used (inversely) to obtain the molar
fractions $x_j^*$, say, from values of response (area) $y_j^*$.


2.1. Straight-line model 


In practice, the detector response may exhibit a small offset as well as the
proportional effect implied by expression (4). The linear regression model
equation


$$y_{ij} = \alpha_j + \beta_j x_{ij} + \eta_{ij} \tag{5}$$


is therefore used to represent this behaviour, in which $\alpha_j$ and $\beta_j$ are the
offset and gradient when measuring the $j$th gas component, and
the error processes $\eta_{ij}$ comprise both random and systematic effects. The
calibration curves are formed by solving [4]


$$\min_{\alpha_j, \beta_j} \sum_{i=1}^{N} (y_{ij} - (\alpha_j + \beta_j x_{ij}))^2, \qquad j = 1,\ldots,q,$$


for each gas component separately, or, equivalently, simultaneously by
solving


$$\min_{\alpha, \beta} \sum_{j=1}^{q} \sum_{i=1}^{N} (y_{ij} - (\alpha_j + \beta_j x_{ij}))^2,$$

where $\alpha = (\alpha_1,\ldots,\alpha_q)^T$ and $\beta = (\beta_1,\ldots,\beta_q)^T$, giving parameter
estimates $\hat{\alpha} = (\hat{\alpha}_1,\ldots,\hat{\alpha}_q)^T$ and $\hat{\beta} = (\hat{\beta}_1,\ldots,\hat{\beta}_q)^T$, and a resulting sum of
squared deviations


$$S = \sum_{j=1}^{q} \sum_{i=1}^{N} (y_{ij} - (\hat{\alpha}_j + \hat{\beta}_j x_{ij}))^2. \tag{6}$$
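
Because the sum separates over the gas components, each calibration line can be fitted on its own with the closed-form straight-line formulas. The sketch below does this and accumulates the sum of squared deviations; the function names, the data layout (one list per component) and the values are hypothetical.

```python
def fit_line(x, y):
    """Ordinary least squares offset and gradient for y = alpha + beta*x."""
    N = len(x)
    xbar = sum(x) / N
    ybar = sum(y) / N
    sxx = sum((xi - xbar) ** 2 for xi in x)
    sxy = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y))
    beta = sxy / sxx
    alpha = ybar - beta * xbar
    return alpha, beta


def total_sum_of_squares(X, Y):
    """Fit each component separately; return the fits and the total S.

    X[j][i] and Y[j][i] hold x_ij and y_ij for component j and standard i.
    """
    fits = [fit_line(xj, yj) for xj, yj in zip(X, Y)]
    S = sum((yi - (a + b * xi)) ** 2
            for (a, b), xj, yj in zip(fits, X, Y)
            for xi, yi in zip(xj, yj))
    return fits, S


# Hypothetical calibration data: two components, four standards
X = [[1.0, 2.0, 3.0, 4.0], [2.0, 4.0, 6.0, 8.0]]
Y = [[2.1, 3.9, 6.2, 7.8], [1.0, 2.1, 2.9, 4.0]]
fits, S = total_sum_of_squares(X, Y)
alpha_exact, beta_exact = fit_line([0.0, 1.0, 2.0], [1.0, 3.0, 5.0])
```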


2.2. Multiplicative model


In order to derive improved calibration curves, introduce multiplicative
correction factors $c_i$ so that

$$c_i \frac{p_i}{T_i} = \frac{p_0}{T_0}$$

for some ‘reference’ values $p_0$ and $T_0$. Then, using expression (2),

$$y_{ij} = \frac{1}{c_i}\, y_{ij}^0, \qquad \text{i.e.,} \qquad y_{ij}^0 = c_i y_{ij},$$

where the $y_{ij}^0$ constitute corrected values of $y_{ij}$. The modified model assigns
the degree $c_i$ to which the fractions $p_i/T_i$ deviate from the ideal constant
fraction $p_0/T_0$. Such corrections compensate for systematic measurement
effects during the measurement process.

The linear regression model equation, including offsets as before, and
regarding the random effects as following a normal distribution with mean
value zero and variance $\sigma^2$, becomes

$$c_i y_{ij} = \alpha_j + \beta_j x_{ij} + \varepsilon_{ij}, \qquad \varepsilon_{ij} \sim \mathrm{NID}(0, \sigma^2). \tag{7}$$


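Calibration curves are then determined by solving

$$\min_{\alpha,\,\beta,\,c} \sum_{j=1}^{q} \sum_{i=1}^{N} \bigl(y_{ij} c_i - (\alpha_j + \beta_j x_{ij})\bigr)^2, \tag{8}$$

where $c = (c_1, \ldots, c_N)^T$. The resulting ‘corrected’ residual sum of squares
is

$$S_{\mathrm{corrected}} = \sum_{j=1}^{q} \sum_{i=1}^{N} \bigl(y_{ij} \hat c_i - (\hat\alpha_j + \hat\beta_j x_{ij})\bigr)^2. \tag{9}$$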


Separation into $q$ independent residual sums of squares is not now possible,
as the presence of the corrections $c$ results in interaction effects among
the terms in the outer summation. Estimation of model parameters is
achieved by solving a system of $m = qN$ linear equations in $n = 2q + N$
unknowns $\alpha$, $\beta$ and $c$ to deliver $\hat\alpha$, $\hat\beta$ and $\hat c = (\hat c_1, \ldots, \hat c_N)^T$.

There are only $2q + N - 1$ mutually independent unknowns since it
is necessary to impose a relationship on $c$ (below). For a unique least-squares
solution to be possible, there must be at least as many equations
as independent unknowns, i.e., $qN \ge 2q + N - 1$. For the system to be
over-determined, $qN \ge 2q + N$. In terms of designing the experiment,
it is generally desirable to maximize the degree of over-determinacy, i.e.,
for $qN \gg 2q + N$, since doing so reduces the influence of the random effects
on the solution.

The problem (8) is, however, ill-posed, since it has the trivial and physically
meaningless solution $\hat\alpha = \hat\beta = \hat c = 0$, giving $S_{\mathrm{corrected}} = 0$. To
obtain a valid solution postulate that the corrections $c$ are realizations of
a discrete random variable $C$ with mean unity and probability mass function
$P(C = c_i) = 1/N$. Then, denoting expectation by $E$,

$$1 = E(C) = \sum_{i=1}^{N} c_i P(C = c_i) = \sum_{i=1}^{N} c_i / N.$$

Thus, the following ‘resolving’ constraint is imposed on the solution:

$$\sum_{i=1}^{N} c_i = N. \tag{10}$$


2.3. Use of the calibration curves 


Given area values $y_j^*$, the inverse use of the calibration curves yields
estimates of the unknown molar fractions $x_j^*$:

$$x_j^* = \frac{y_j^* - \hat\alpha_j}{\hat\beta_j}. \tag{11}$$


3. Numerical methods 
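Express the measurements of the $j$th gas component as

$$y_j = (y_{1j}, \ldots, y_{Nj})^T$$

and let

$$y = (y_1^T, \ldots, y_q^T)^T$$

denote the complete set of $qN$ measurements. Let the calibration parameters
be expressed as

$$b = (b_1^T, \ldots, b_q^T)^T, \qquad b_j = (\alpha_j, \beta_j)^T.$$

For the linear regression model (5) with uncoupled pairs $b_j$, $j = 1, \ldots, q$,
of calibration parameters,

$$X b \approx y \qquad \text{or} \qquad X_j b_j \approx y_j, \quad j = 1, \ldots, q,$$

where

$$X = \operatorname{diag}\{X_j\}_{j=1}^{q}, \qquad X_j = \begin{bmatrix} 1 & x_{1j} \\ \vdots & \vdots \\ 1 & x_{Nj} \end{bmatrix}.$$

The least-squares solution is

$$\hat b = (X^T X)^{-1} X^T y \qquad \text{or} \qquad \hat b_j = (X_j^T X_j)^{-1} X_j^T y_j, \quad j = 1, \ldots, q.$$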


For a general linear regression model $Xb \approx y$, where $X$ is an $m \times n$ design
matrix, $m \ge n$ and $X$ has rank $n$, the solution can be obtained in a numerically
stable way by using an orthogonal factorization $X = QR$ of $X$,
where $Q$ is orthogonal and $R$ upper-triangular ¹⁰. $R$, the Cholesky factor
of $X^T X$, satisfies $R^T R = X^T X$. $\hat b$ is then obtained by solving the triangular
system $R \hat b = Q^T y$.ᵃ This approach is used for all least-squares problems
considered here.

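For the linear regression model (7) with coupled pairs of calibration
parameters,

$$[\,X \;\; -S\,]\, b' \approx 0, \tag{12}$$

where

$$S = [\operatorname{diag}(y_1), \ldots, \operatorname{diag}(y_q)]^T, \qquad b' = (b^T, c^T)^T.$$

The resolving constraint (10) can be incorporated through a Lagrange
multiplier or, directly, by expressing $c_N$ in terms of $c_1, \ldots, c_{N-1}$. Using the
latter approach, the system (12) becomes

$$\tilde X \tilde b' \approx z, \qquad
\tilde X = \begin{bmatrix} X_1 & & & T_1 \\ & \ddots & & \vdots \\ & & X_q & T_q \end{bmatrix}, \qquad
T_j = \begin{bmatrix} -S'_j \\ y_{Nj}\, e_{N-1}^T \end{bmatrix}, \qquad
z = \begin{bmatrix} z_1 \\ \vdots \\ z_q \end{bmatrix},$$

with $\tilde b' = (b^T, c_1, \ldots, c_{N-1})^T$, $S'_j = \operatorname{diag}(y_{1j}, \ldots, y_{N-1,j})$, $z_j = [0, \ldots, 0, N y_{Nj}]^T$ and $e_r$ a column
vector of $r$ ones. This system has a unique least-squares solution that
would be obtained as above by the QR decomposition of $\tilde X$.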
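
The construction of the constrained system can be sketched in Python as follows. This is an illustrative sketch only: the function names, the synthetic noise-free data and the "true" parameter values are assumptions made for the example, and the small dense system is solved through the normal equations with Gaussian elimination purely to keep the example dependency-free, whereas the text uses a QR factorization.

```python
def solve_linear(A, b):
    """Solve the square system A x = b by Gaussian elimination with partial pivoting."""
    n = len(A)
    M = [row[:] + [rhs] for row, rhs in zip(A, b)]
    for k in range(n):
        p = max(range(k, n), key=lambda r: abs(M[r][k]))
        M[k], M[p] = M[p], M[k]
        for r in range(k + 1, n):
            f = M[r][k] / M[k][k]
            for col in range(k, n + 1):
                M[r][col] -= f * M[k][col]
    x = [0.0] * n
    for k in reversed(range(n)):
        s = sum(M[k][col] * x[col] for col in range(k + 1, n))
        x[k] = (M[k][n] - s) / M[k][k]
    return x

def least_squares(X, z):
    """Least-squares solution of X b ~ z via the normal equations (illustration only)."""
    n = len(X[0])
    XtX = [[sum(row[p] * row[r] for row in X) for r in range(n)] for p in range(n)]
    Xtz = [sum(row[p] * zi for row, zi in zip(X, z)) for p in range(n)]
    return solve_linear(XtX, Xtz)

def fit_coupled(x, y):
    """x[i][j], y[i][j]: molar fraction and response for standard i, component j.
    Unknowns are ordered (alpha_1, beta_1, ..., alpha_q, beta_q, c_1, ..., c_{N-1});
    c_N is eliminated through the resolving constraint sum(c) = N."""
    N, q = len(x), len(x[0])
    n = 2 * q + N - 1
    rows, rhs = [], []
    for j in range(q):
        for i in range(N):
            row = [0.0] * n
            row[2 * j] = 1.0
            row[2 * j + 1] = x[i][j]
            if i < N - 1:
                row[2 * q + i] = -y[i][j]          # -S'_j block
                rhs.append(0.0)
            else:
                for k in range(N - 1):
                    row[2 * q + k] = y[i][j]       # y_Nj * e^T block
                rhs.append(N * y[i][j])            # last element of z_j
            rows.append(row)
    b = least_squares(rows, rhs)
    alpha, beta, c = b[0:2 * q:2], b[1:2 * q:2], b[2 * q:]
    c.append(N - sum(c))                           # recover c_N
    return alpha, beta, c

# Synthetic, noise-free check with N = 5 standards and q = 2 components.
alpha_true, beta_true = [0.5, -0.2], [2.0, 3.0]
c_true = [1.05, 0.97, 1.02, 0.95, 1.01]            # sums to N = 5
x = [[1.0, 10.0], [2.0, 12.0], [3.0, 15.0], [4.0, 11.0], [5.0, 14.0]]
y = [[(alpha_true[j] + beta_true[j] * x[i][j]) / c_true[i] for j in range(2)]
     for i in range(5)]
alpha_hat, beta_hat, c_hat = fit_coupled(x, y)
```

With noise-free data generated from model (7), the recovered parameters should agree with the assumed values to within rounding error, which gives a simple check on the construction of the coupled design matrix.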

$\tilde X$ has block-angular structure ⁵,⁶, advantage of which can be taken for
a numerically efficient solution. The size of typical problems in the gas
standards area is such that it is unnecessary to account for such structure.
In other areas where sets of coupled calibration curves are to be used and
there are many measurements to be processed, savings in computer time
and memory would be possible.

ᵃ The solution is straightforwardly implemented in a system providing appropriate
linear algebra support. For instance, in MATLAB ¹¹, [Q, R] = qr(X), bhatprime =
R\(Q'*y).


4. Uncertainties 


The uncertainties associated with the use of uncorrected and corrected mea- 
surements in obtaining the calibration curves, and with the use of the cal- 
ibration curves so obtained, are considered. 


4.1. Uncorrected measurements 


For the uncorrected measurements, the uncertainty matrix (covariance matrix)
associated with $\hat b$ is $V(\hat b) = \sigma^2 U U^T$, where $U$, an upper-triangular matrix,
is the inverse of $R$ (formed by solving the triangular system $RU = I$,
with $I$ the identity matrix). The value of $\sigma$ would either be provided a
priori as the measurement standard uncertainty $\sigma_y$ associated with the elements
of $y$, or (if there is no lack of conformity of the model and the data)
estimated by the root-mean-square residual $(S/\nu)^{1/2} = \|y - X\hat b\|_2/\nu^{1/2}$,
where $S$ is given by expression (6) and $\nu = m - n$ is the degrees of freedom.
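
For a single straight-line calibration curve the uncertainty matrix can be written in closed form, since $U U^T = (R^T R)^{-1} = (X^T X)^{-1}$. The sketch below is a hedged illustration of that special case, with a hypothetical function name and illustrative data; it uses the posterior estimate of $\sigma$ from the root-mean-square residual with $\nu = N - 2$ degrees of freedom.

```python
import math

def line_fit_with_covariance(x, y):
    """Straight-line fit y ~ a + b*x returning ((a, b), sigma, V), where
    V = sigma^2 (X^T X)^{-1} is the 2x2 uncertainty matrix of (a, b) and
    sigma is estimated from the residuals with nu = n - 2 degrees of freedom."""
    n = len(x)
    sx, sy = sum(x), sum(y)
    sxx = sum(xi * xi for xi in x)
    sxy = sum(xi * yi for xi, yi in zip(x, y))
    det = n * sxx - sx * sx                  # determinant of X^T X
    b = (n * sxy - sx * sy) / det
    a = (sy - b * sx) / n
    rss = sum((yi - (a + b * xi)) ** 2 for xi, yi in zip(x, y))
    sigma = math.sqrt(rss / (n - 2))         # root-mean-square residual
    s2 = sigma ** 2
    V = [[s2 * sxx / det, -s2 * sx / det],
         [-s2 * sx / det, s2 * n / det]]
    return (a, b), sigma, V

# Illustrative data only.
(a_hat, b_hat), sigma_hat, V = line_fit_with_covariance(
    [1.0, 2.0, 3.0, 4.0], [2.1, 3.9, 6.1, 7.9])
```

If $\sigma$ is instead known a priori, it would replace the posterior estimate in the expression for `V`.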


4.2. Corrected measurements 


For the corrected measurements, the uncertainty matrix $V(\hat b')$ would be
formed as above, where $\sigma_y$ denotes the standard uncertainty associated
with $c_i y_{ij}$, i.e., the corrected $y_{ij}$. For a valid fit to the data, $\sigma_y$ can be
estimated by $(S_{\mathrm{corrected}}/(m - n + 1))^{1/2}$, where $S_{\mathrm{corrected}}$ is given by expression (9).
The degrees of freedom is $m - n + 1$, rather than $m - n$, because
of the use of the resolving constraint (10).


4.3. Use of the calibration curves 


The standard uncertainties associated with the values $x_j^*$ determined from
the inverse use of the calibration curve, i.e., through expression (11), are
evaluated by the straightforward application of the law of propagation of
uncertainty ⁷:

$$u^2(x_j^*) = \frac{1}{\hat\beta_j^2} \left[ u^2(y_j^*) + u^2(\hat\alpha_j) + (x_j^*)^2 u^2(\hat\beta_j) + 2 x_j^* u(\hat\alpha_j, \hat\beta_j) \right], \tag{13}$$

where $u(\hat\alpha_j, \hat\beta_j)$ is the covariance associated with $\hat\alpha_j$ and $\hat\beta_j$. $u(y_j^*)$ is
taken as $\sigma_y$, and $u^2(\hat\alpha_j)$, $u^2(\hat\beta_j)$ and $u(\hat\alpha_j, \hat\beta_j)$ are taken as the appropriate
elements of the uncertainty matrix $V(\hat b)$ or $V(\hat b')$.

The main contribution to $u(x_j^*)$ is typically from the term involving
$u(y_j^*)$, especially if many standards are used in the calibration (i.e., $N$ is
large). In this case, $u(x_j^*) \approx u(y_j^*)/|\hat\beta_j|$. When working with corrected
measurements, additional terms should strictly be included in expression (13)
that result from correlations between the curve parameters and the correction
factors. The corresponding covariances are available as elements of the
uncertainty matrix $V(\hat b')$; these too are expected to be small.
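
The inverse use of a straight-line calibration curve and the associated propagation of uncertainty can be sketched as below. The function name and numerical values are hypothetical; the sketch assumes the inverse relation $x^* = (y^* - \hat\alpha)/\hat\beta$ and propagates the uncertainties of $y^*$, $\hat\alpha$ and $\hat\beta$ (including their covariance) to first order, ignoring the additional correlation terms mentioned above for corrected measurements.

```python
import math

def inverse_use(y_star, a, b, u_y, V):
    """Return (x_star, u_x) for the calibration line y = a + b*x.
    V is the 2x2 uncertainty matrix of (a, b); u_y is the standard
    uncertainty of the measured response y_star."""
    x_star = (y_star - a) / b
    var = (u_y ** 2 + V[0][0] + x_star ** 2 * V[1][1]
           + 2.0 * x_star * V[0][1]) / b ** 2
    return x_star, math.sqrt(var)

# Illustrative values (assumed, not taken from the tables in this paper).
V = [[0.024, -0.008], [-0.008, 0.0032]]
x_star, u_x = inverse_use(5.0, 0.1, 1.96, math.sqrt(0.016), V)

# Dominant-term approximation u(x*) ~ u(y*)/|b|.
u_x_approx = math.sqrt(0.016) / abs(1.96)
```

Comparing `u_x` with `u_x_approx` shows how much of the uncertainty is attributable to the response alone for a given calibration.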


5. Results 


Results are given for two applications of the method, to the measurement 
of five and then seven minor gas components. 


5.1. Five minor gas components (26 standards) 


The numerical method described in section 3 was applied to measurements 
of q = 5 minor gas components using N = 26 standards. Tests for normality 
of the residuals corresponding to uncorrected and corrected measurements 
(cf. section 2) were carried out to check the validity of the distributional 
assumption made in section 2.2. The Kolmogorov-Smirnov test was used
for this purpose, under the null hypothesis that the underlying distribution 
is normal. The tests provided no reason to reject the hypothesis. 

Tables 1, 2 and 3 show the results obtained. Table 1 gives the residual 
sums of squares for straight-line calibration curves for the main components 
of the standards without and with correction. Some highly significant im- 
provements are observed. In particular, the residual sums of squares for 
the first components, nitrogen and methane (table 1), reduced by factors 
of 3 x 10° and 1 x 10°, respectively. The corresponding root-mean-square 
residuals reduced by factors of 5 x 10! and 1 x 10%. The standard uncer- 
tainties for the calculated molar fractions (table 2) reduced by the same 
factors (to the one significant digit quoted), as might be expected in the 
context of regression. 

Table 3 gives the 26 estimated correction factors. They are all of the 
order of unity as expected. Expressed as fractional changes in the response 
values, they range from —8 % to +6 %. In this example, the range of 
sampling conditions was artificially enlarged to demonstrate the method. 

Figures 6 and 7 depict the results obtained for the first two of these 
components, nitrogen and methane. In each figure the uncorrected data 
and the (independently constructed) calibration curves are shown to the 
left, and the corrected data and the coupled calibration curves to the right. 




Table 1. Residual sums of squares for straight-line calibration 
curves with and without correction for measurements of q = 5 
gas components using N = 26 standards. 


Methane 


CO2 

Ethane 

Propane 

Total RSS 1.0 x 10 


Table 2. Inverse use of the calibration curves obtained with and without correction 
factors. 





Calculated molar Standard 
Results Responses fraction 2* uncertainty u(z*) 
Uncorrected Nitrogen 5.129 8 x 10 19.2706 
Corrected 20.2408 
Uncorrected Methane 1.8753x10® 91.3191 
Corrected 90.7668 





Table 3. The correction factors corresponding to the results in table 1. 


1.049 1.056 1.046 0.966 1.044 0.976 1.022 0.958 0.958 
0.979 1.036 0.977 0.976 0.915 0.968 0.968 0.973 1.021 
1.030 0.947 0.990 1.020 1.030 1.026 1.014 1.041 













































5.2. Seven minor gas components (26 standards) 


Table 4 gives a further instance of the residual sums of squares for straight- 
line calibration curves without and with correction. The measurement re- 
sults corresponded to seven minor gas components and 26 standards. Again 
the Kolmogorov-Smirnov test gave no reason to doubt that the underlying 
distribution is normal. Although a substantial improvement was again ob- 
served, it was not as large as for the first example, implying that the sys- 
tematic effects were not so dominant at the lower concentrations involved 
in this case. 


Figure 6. Experimental (uncorrected) data and calibration curve for nitrogen and the
corresponding corrected results (right). [Axes: nitrogen gravimetric concentration
against nitrogen area (left) and nitrogen corrected area (right).]


Figure 7. As figure 6 but for methane.


Table 4. The residual sums of squares for straight-line calibration
curves with and without correction for measurements
of q = 7 gas components using N = 26 standards.

Gas components    Residual sum of squares (RSS)
                  Experimental results
Propane           6.6 x 10
i-Butane          1.1 x 1019
n-Butane          1.7 x 101°
neo-Pentane       4.3 x 108
i-Pentane         8.1 x 108
n-Pentane         6.2 x 108
n-Hexane          7.3 x 108
Total RSS         7.0 x 10


6. Extensions


Two extensions of the approach are briefly indicated, one to higher-order
models and the other to different uncertainty structures.


6.1. Higher-order models


The approach extends to quadratic (or higher-order) regression models
(that are linear in the parameters) that account for possibly non-linear
behaviour of the detector:


When the resulting calibration curves are used to obtain estimates of 
unknown molar fractions, it is generally necessary to use some form of ‘in- 
verse interpolation’. Approaches for such computations and the evaluation 
of the associated uncertainties are available ”. 


6.2. Other uncertainty structures 


If the measurement data has different uncertainties associated with the
measurement results, suitable weights can be introduced to accommodate
the influence on the model parameter estimates. Define the weight matrix

$$W = \operatorname{diag}(w_1^T, \ldots, w_q^T), \qquad w_j = (w_{1j}, \ldots, w_{Nj})^T.$$

In the absence of systematic effects the regression model (5) becomes

$$w_{ij}^{1/2} y_{ij} = w_{ij}^{1/2} (\alpha_j + \beta_j x_{ij}) + \varepsilon_{ij},$$

with $w_{ij} = 1/u^2(y_{ij})$, giving

$$W^{1/2} X b \approx W^{1/2} y$$

as the weighted counterpart of the unweighted system $Xb \approx y$. The solution
can be effected using QR decomposition as before, except that the matrix
to be factorized is $W^{1/2} X$ rather than $X$ and the ‘right-hand side’ is $W^{1/2} y$
rather than $y$.
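
For a single straight-line curve the weighted problem again has a closed-form solution, which the following sketch uses for illustration. The function name and data are hypothetical; the weights are taken as $w_i = 1/u^2(y_i)$, and the normal equations of the scaled system are solved directly rather than through a QR factorization.

```python
def fit_line_weighted(x, y, u):
    """Weighted straight-line fit y ~ a + b*x with weights w_i = 1/u_i^2,
    i.e. the least-squares solution of the row-scaled system
    W^(1/2) X b ~ W^(1/2) y for one gas component."""
    w = [1.0 / ui ** 2 for ui in u]
    sw = sum(w)
    swx = sum(wi * xi for wi, xi in zip(w, x))
    swy = sum(wi * yi for wi, yi in zip(w, y))
    swxx = sum(wi * xi * xi for wi, xi in zip(w, x))
    swxy = sum(wi * xi * yi for wi, xi, yi in zip(w, x, y))
    det = sw * swxx - swx ** 2
    b = (sw * swxy - swx * swy) / det
    a = (swy - b * swx) / sw
    return a, b

# Illustrative data: the last response is far from the line 1 + 2x but is
# assigned a very large standard uncertainty, so it barely influences the fit.
a_w, b_w = fit_line_weighted([0.0, 1.0, 2.0, 3.0],
                             [1.0, 3.0, 5.0, 17.0],
                             [0.1, 0.1, 0.1, 1.0e6])

# With equal uncertainties the result coincides with the unweighted fit.
a_e, b_e = fit_line_weighted([1.0, 2.0, 3.0, 4.0],
                             [2.1, 3.9, 6.1, 7.9],
                             [0.1, 0.1, 0.1, 0.1])
```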

If there are systematic effects also, the regression model becomes

$$w_{ij}^{1/2} c_i y_{ij} = w_{ij}^{1/2} (\alpha_j + \beta_j x_{ij}) + \varepsilon_{ij},$$

with $w_{ij} = 1/(c_i^2 u^2(y_{ij}))$, giving

$$\frac{y_{ij}}{u(y_{ij})} = \frac{\alpha_j}{c_i\, u(y_{ij})} + \frac{\beta_j x_{ij}}{c_i\, u(y_{ij})} + \varepsilon_{ij}, \tag{14}$$

resulting in a minimization problem that is non-linear in the parameters.
This problem would be solved using an appropriate non-linear least-squares
algorithm in conjunction with the resolving constraint (10). An approximate
linear model would be given by multiplying all terms in the model (14)
by $c_i$ and, since $c_i$ is expected to be close to unity, replacing $c_i \varepsilon_{ij}$ by $\varepsilon_{ij}$,
yielding a counterpart of the approach for the homoscedastic case.


7. Conclusions


In a class of problems arising in the analysis of natural gases, the derived 
calibration curves that relate stimulus concentrations to indicated detector 
responses are mutually dependent as a consequence of changing environ- 
mental and instrumental conditions during measurement. A model was 
constructed that accounts for the systematic effects that give rise to this 
covariation. The method implemented to estimate the model parameters 
is least squares. It provides a solution that essentially removes these ef- 
fects. The result was a set of corrected measurements whose associated 
uncertainties are dominantly associated with random effects and are sig- 
nificantly smaller in magnitude. The required molar fractions can then be 
determined appreciably more accurately compared with their determina- 
tion without making such a correction. The numerical methods used to 
provide such solutions were outlined. 


Consider approximating a set of discretely defined values $f_1, f_2, \ldots, f_m$, say at
$x = x_1, x_2, \ldots, x_m$, with a chosen approximating form. Given prior knowledge that
noise is present and that some might be outliers, a standard least squares approach
based on an $\ell_2$ norm of the approximation error $\varepsilon$ may well provide poor estimates.
We instead consider a least squares approach based on a modified measure taking
the form $\tilde\varepsilon = \varepsilon (1 + c^2 \varepsilon^2)^{-1/2}$, where $c$ is a constant to be fixed. Given a prior
estimate of the likely standard deviation of the noise in the data, it is possible
to determine a value of $c$ such that the estimator behaves like a robust estimator
when outliers are present but like a least squares estimator otherwise. We describe
algorithms for computing the parameter estimates based on an iteratively weighted
linear least squares scheme, the Gauss-Newton algorithm for nonlinear least squares
problems and the Newton algorithm for function minimization. We illustrate their
behaviour on approximation with polynomial and radial basis functions and in an
application in co-ordinate metrology.


1. Introduction 


Approximation is concerned with fitting a function $F(x, a)$ depending on
parameters $a$ to a set of data $(x_i, f_i)$, $i = 1, \ldots, m$, in such a way as to
minimize over $a$ some measure of the error of fit. The form $F$ is often
specified as a linear combination

$$F(x, a) = \sum_{j=1}^{n} a_j \phi_j(x)$$

of basis functions $\phi_j(x)$. For example, we can consider a polynomial basis
where $\phi_j(x) = x^{j-1}$, or $\phi_j(x) = T_{j-1}(x)$ (Chebyshev basis), or a radial
basis function (RBF) with $\phi_j(x) = \phi(\|x - \lambda_j\|)$, where $\phi$ is a univariate
function and $\|\cdot\|$ denotes the Euclidean norm. (Thus, $\|x - \lambda_j\|$ is simply
the distance from $x$ to the centre $\lambda_j$.)

Work partially funded under EU SofTools_MetroNet Contract N. G6RT-CT-2001-05061

It is quite common that noise of more than one type should be present 
in the data. We are particularly interested in situations where the majority 
of the data arising from a physical system behaves as predicted by the 
statistical model, but a minority (no more than 25%) has been corrupted 
by unknown and potentially large influence factors. The latter may not be 
easy to identify and we look for approximation methods that are robust 
with respect to such effects. 

The quality of the approximation $F$ to $f$ is assessed by a measure, such
as a norm $\|f - F\|$ of the approximation error $\{f_i - F(x_i, a)\}$. The most
commonly adopted norm is the $\ell_2$ or least squares norm

$$\|f - F\|_2 = \left\{ \sum_{i=1}^{m} \bigl(f_i - F(x_i, a)\bigr)^2 \right\}^{1/2}.$$

The $\ell_2$ norm is appropriate when the noise in the data is drawn from a
normal distribution and it can be shown that a least squares approach
is optimal for this case. However, if the data has wild points, i.e., large
isolated errors, a least squares approach is likely to provide a poor fit. In
this situation, a better norm to adopt is the $\ell_1$ norm,

$$\|f - F\|_1 = \sum_{i=1}^{m} |f_i - F(x_i, a)|,$$

which in general is less affected by the presence of wild points. Although
$\ell_1$ approximation can be considered quite effective, we give three reasons
for looking for other approaches. Firstly, while there exist satisfactory $\ell_1$
approximation algorithms for linear models, the algorithms for nonlinear
models are much more difficult to implement. Secondly, $\ell_1$ approximation
does not enjoy the same (or equivalent) optimal behaviour as that of
$\ell_2$ approximation. The price of being able to deal with wild points is that
somewhat poorer use is made of the valid data points. Thirdly, there is no
satisfactory method of estimating the uncertainty associated with the fitted
parameters for $\ell_1$ approximation. By comparison, the $\ell_2$ case is straightforward
and well-rooted in probability theory. Ideally we aim to find a
compromise between these two norms in order to provide effective algorithms
for data reflecting both normally distributed noise and wild points.

In section 2 we describe a new least squares approach, asymptotic
least squares (ALS), that combines least squares approximation with a
nonlinear transformation function that offers, we believe, a straightforward
and effective method. In section 3, we present a number of algorithms for 
ALS approximation based on standard nonlinear least squares and iterative 
weighting schemes. A number of example applications are presented in 
sections 4-6, and our concluding remarks are given in section 7. 


2. Asymptotic least squares 


Below, we let $(x_i, f_i)$, $i = 1, \ldots, m$, be a set of data points, $F(x, a)$ an
approximating function depending on parameters $a = (a_1, \ldots, a_n)^T$, and
$\varepsilon_i = \varepsilon_i(a) = f_i - F(x_i, a)$, the approximation error associated with the $i$th
data point. Often, $F$ represents a linear model so that $F_i = F(x_i, a) = c_i^T a$
for some vector of coefficients $c_i = (c_{i1}, \ldots, c_{in})^T$. If $F$ is a linear
combination of basis functions then $c_{ij} = \phi_j(x_i)$. In matrix notation we
write $F = Ca$ where $F = (F_1, \ldots, F_m)^T$.

The least-squares best fit function is the one specified by the parameters
$a$ that minimize

$$E(a) = \frac{1}{2} \sum_{i=1}^{m} \varepsilon_i^2(a).$$

(The fraction 1/2 is included to simplify later expressions.) The fact that
the contribution to the objective function $E$ associated with each data
point is the square of the approximation error means that a least squares
solution will try to accommodate wild points to the detriment of the quality
of fit at the remaining points. In $\ell_1$ approximation, the contribution of each
point to the objective function is the approximation error itself. This allows
an $\ell_1$ approximant to tolerate large approximation errors associated with
wild points.

The main idea of asymptotic least squares (ALS) is to apply a transformation
function to the approximation error $\varepsilon_i$ that will reduce the effect of
large errors associated with wild points. We look to minimize an objective
function of the form

$$\tilde E(a) = \frac{1}{2} \sum_{i=1}^{m} \tilde\varepsilon_i^2(a), \qquad \tilde\varepsilon_i = \tau(\varepsilon_i), \tag{1}$$

for some suitable transformation $\tau$. We require i) $\tau$ to have continuous
second derivatives so that minimizing $\tilde E$ is a smooth optimization problem,
ii) $\tau(0) = 0$, $\tau'(0) = 1$ and $\tau''(0) = 0$ so that for small errors, $\tilde E$ has
similar behaviour to a standard least squares objective function, and iii)
$\lim_{|\varepsilon| \to \infty} \tau'(\varepsilon) = 0$ so that increasing an already large error will have a
marginal effect on $\tilde E$. A simple function satisfying these criteria is

$$\tau(x) = x/(1 + c^2 x^2)^{1/2}. \tag{2}$$

We note that $\lim_{\varepsilon \to \pm\infty} \tau(\varepsilon) = \pm 1/c$. There are various other choices that
may be made for $\tau(\varepsilon)$, with similar properties to (2), e.g., $\tau(\varepsilon) = \tanh(c\varepsilon)$.





Figure 1. Graph of $\tau$ defined in (2) for different values of $c$.


The parameter $c$ in (2) controls the level of $\varepsilon$ at which the transform
takes effect (figure 1). If we expect the (standard) noise in the data to be
described by a normal distribution with mean zero and standard deviation
unity, then we would expect approximately 95% of the approximation errors
to lie in the interval $[-2, 2]$. In this region, we want $\tau$ to make a small change,
suggesting a value of $c$ in the region of $c = 1/4$.
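
The behaviour of the transformation (2) for $c = 1/4$ can be checked numerically with a few lines of Python. The function names are illustrative; the derivative follows from differentiating (2), which gives $\tau'(x) = (1 + c^2 x^2)^{-3/2}$.

```python
import math

def tau(x, c):
    """ALS transformation (2): tau(x) = x / (1 + c^2 x^2)^(1/2)."""
    return x / math.sqrt(1.0 + (c * x) ** 2)

def tau_prime(x, c):
    """Derivative of (2): tau'(x) = (1 + c^2 x^2)^(-3/2)."""
    return (1.0 + (c * x) ** 2) ** -1.5

c = 0.25
small = tau(2.0, c)        # error at the edge of the interval [-2, 2]
large = tau(1.0e6, c)      # a very large error saturates near 1/c = 4
slope_far = tau_prime(1.0e6, c)
```

An error of 2 is reduced only to about 1.79, while arbitrarily large errors are mapped to values below $1/c = 4$, which is the sense in which wild points have limited influence on the objective function.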

Given such a $\tau$ satisfying the criteria above, we can define related transform
functions of the form

$$\tau_d(x) = \begin{cases} x, & |x| \le d, \\ \tau(x - d) + d, & x > d, \\ \tau(x + d) - d, & x < -d, \end{cases} \tag{3}$$

that also satisfy the criteria. The parameter $d$ ensures that $\tilde E$ is the same
as $\ell_2$ for errors in $[-d, d]$.

The definition of $\tau_d$ in (3) can be compared with a Huber estimation
approach which minimizes

$$E_H(a) = \sum_{i=1}^{m} \rho(\varepsilon_i(a))$$


where 
x, if |z| < d, 
p= P if je| > d. 
For small residuals (data generated by a system obeying the statistical
model), the Huber estimator behaves like a least squares estimator. For
large residuals (outliers) the estimator is like $\ell_1$.


If the noise in the data is drawn from a distribution with standard
deviation $\sigma$ then an appropriate form of $\tau$ is

$$\tau(x) = x/(1 + c^2 x^2/\sigma^2)^{1/2} \qquad \text{or} \qquad \tau(x) = (x/\sigma)/(1 + c^2 (x/\sigma)^2)^{1/2}. \tag{4}$$

In terms of determining the parameters $a$ these two forms are equivalent
but the latter formulation gives a more statistically relevant form.


3. Algorithms for asymptotic least squares 


Even if $F$ is linear in the parameters $a$, the introduction of the nonlinear $\tau$
function makes the minimization of $\tilde E$ a nonlinear least squares problem.
In this section we consider general approaches to its solution.


3.1. Nonlinear least squares optimization 


Let us first consider the problem of minimizing a general (smooth) function
$E(a)$ of $n$ variables. Let $g(a)$ be the gradient of $E$, $g_j = \partial E/\partial a_j$, and $H$
the Hessian matrix of second partial derivatives, $H_{jk} = \partial^2 E/\partial a_j \partial a_k$. In
the Newton algorithm, an estimate of the solution $a$ is updated according
to $a := a + tp$, where $p$ solves $Hp = -g$ and $t$ is a step length chosen to
ensure a sufficient decrease in $E$. Near the solution, the Newton algorithm
converges quadratically. Away from the solution, there is no guarantee that
the Hessian matrix will be strictly positive definite and algorithms have to
deal appropriately with this eventuality.

Suppose now $E$ is a sum of squares function $E(a) = \frac{1}{2} \sum_{i=1}^{m} \varepsilon_i^2(a)$ and
let $J(a)$ be the Jacobian matrix $J_{ij} = \partial \varepsilon_i/\partial a_j$. Then $g = J^T \varepsilon$ and $H =
J^T J + G$, where

$$G_{jk} = \sum_{i=1}^{m} \varepsilon_i \frac{\partial^2 \varepsilon_i}{\partial a_j \partial a_k}.$$

The Gauss-Newton (GN) algorithm follows the same approach as the Newton
algorithm, only that in determining the update step, $H$ is approximated
by $J^T J$, i.e., the term $G$ is ignored and $p$ is found by solving the linear least
squares problem $J^T J p = -J^T \varepsilon$. A stable method of solving this system
is to use an orthogonal factorization of the matrix $J$. The GN algorithm
in general converges linearly at a rate that depends on the condition of
the approximation problem, the size of the residual errors near the solution
and the curvature. If the problem is well-conditioned, the residuals are
small and the summand functions $\varepsilon_i$ are nearly linear, then $J^T J$ is a good
approximation to the Hessian matrix $H$ and convergence is fast. The GN
algorithm has the advantage that if $J$ is full rank, then $J^T J$ is strictly positive
definite so that some of the complications associated with a Newton
algorithm implementation are avoided.

Uncertainty matrix associated with the fitted parameters. For standard
nonlinear least squares problems, the uncertainty matrix of the fitted
parameters is estimated by

$$U_a = \hat\sigma^2 (J^T J)^{-1},$$

where $\hat\sigma$ is an estimate of the standard deviation $\sigma$ of the noise in the data.
A posterior estimate is $\hat\sigma^2 = \varepsilon^T \varepsilon/(m - n)$, where $\varepsilon$ is the vector of residuals
evaluated at the solution. If the Hessian matrix $H$ is available, $U_a = \hat\sigma^2 H^{-1}$
is generally a better estimate.

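Let us now return to the ALS problem of minimizing $\tilde E(a)$ defined in
(1). To employ a Newton-type algorithm we need to calculate

$$g = \tilde J^T \tilde\varepsilon, \qquad \tilde J_{ij} = \frac{\partial \tilde\varepsilon_i}{\partial a_j} = \dot\tau_i \frac{\partial \varepsilon_i}{\partial a_j}, \qquad \dot\tau_i = \frac{d\tau}{d\varepsilon}(\varepsilon_i),$$

and

$$H = \tilde J^T \tilde J + \tilde G, \qquad \tilde G_{jk} = \sum_{i=1}^{m} \tilde\varepsilon_i \frac{\partial^2 \tilde\varepsilon_i}{\partial a_j \partial a_k}.$$

We note that

$$\frac{\partial^2 \tilde\varepsilon_i}{\partial a_j \partial a_k} = \ddot\tau_i \frac{\partial \varepsilon_i}{\partial a_j} \frac{\partial \varepsilon_i}{\partial a_k} + \dot\tau_i \frac{\partial^2 \varepsilon_i}{\partial a_j \partial a_k}, \qquad \ddot\tau_i = \frac{d^2\tau}{d\varepsilon^2}(\varepsilon_i).$$

The first term on the right is the contribution due to the curvature in $\tau$, the
second, due to that in $F$. Even if the second term is small, the first term
is likely to be significant. This means that in practice the Gauss-Newton
algorithm implemented for ALS will have significantly slower convergence
than a Newton algorithm. However, if $F$ is linear with $\varepsilon = f - Ca$, the
second term is zero and a Newton algorithm can be implemented easily
with $\tilde J$ and $\tilde G$ calculated using the following identities:

$$\tilde J_{ij} = -\dot\tau_i c_{ij}, \qquad \tilde G_{jk} = \sum_{i=1}^{m} \tilde\varepsilon_i \ddot\tau_i c_{ij} c_{ik}.$$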


3.2. Iteratively weighted least squares 


In this section, we introduce two novel algorithms for tackling the nonlinear
problem of minimizing $\tilde E(a)$. Iterating over $q$, we minimize at step $q$,

$$\frac{1}{2} \sum_{i=1}^{m} \bigl(w_i^q \varepsilon_i(a)\bigr)^2, \qquad w_i^q = \bigl(1 + (c \varepsilon_i^q)^2\bigr)^{-1/2}, \tag{5}$$

to determine estimates $a_{q+1}$ and associated approximation errors $\varepsilon_i^{q+1}$.
Here, the weights $w_i^q$ are fixed and determined from the solution $\varepsilon^q$ at the
$q$th iteration. Updated estimates of the parameters $a$ are therefore found
by solving a weighted least squares problem. In practice this algorithm
converges at a rapid rate to a near-best $\ell_2$ approximation. Since each step
is a least squares problem the algorithm is straightforward to implement,
especially for linear models.
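
A minimal sketch of the multiplicative iteration for a straight-line model is given below. The function names, the data and the fixed number of iterations are assumptions made for illustration; the weights are taken as $w_i = (1 + (c\varepsilon_i)^2)^{-1/2}$, so the squared weights multiply the squared residuals in each weighted least squares step.

```python
def weighted_line(x, f, w2):
    """Minimise sum_i w2[i] * (f[i] - a0 - a1*x[i])**2; returns (a0, a1)."""
    sw = sum(w2)
    swx = sum(w * xi for w, xi in zip(w2, x))
    swf = sum(w * fi for w, fi in zip(w2, f))
    swxx = sum(w * xi * xi for w, xi in zip(w2, x))
    swxf = sum(w * xi * fi for w, xi, fi in zip(w2, x, f))
    det = sw * swxx - swx ** 2
    a1 = (sw * swxf - swx * swf) / det
    a0 = (swf - a1 * swx) / sw
    return a0, a1

def als_irls(x, f, c, iterations=20):
    """Iteratively weighted least squares for a straight line.
    The first step uses unit weights (ordinary least squares); the weights
    are then recomputed from the residuals of the previous step."""
    w2 = [1.0] * len(x)
    for _ in range(iterations):
        a0, a1 = weighted_line(x, f, w2)
        eps = [fi - (a0 + a1 * xi) for xi, fi in zip(x, f)]
        w2 = [1.0 / (1.0 + (c * e) ** 2) for e in eps]   # squared weights
    return a0, a1

# Illustrative data: points on the line 1 + 2x with one wild point at x = 10.
x = [float(i) for i in range(11)]
f = [1.0 + 2.0 * xi for xi in x]
f[10] += 50.0

ls_fit = weighted_line(x, f, [1.0] * len(x))   # ordinary least squares
als_fit = als_irls(x, f, c=0.25)               # iteratively weighted fit
```

In this example the least squares gradient is pulled well away from 2 by the wild point, whereas the iteratively weighted estimate stays close to the line followed by the remaining points.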

Having observed the simple but general form of (5), we are able, as an 
alternative, to replace this “multiplicative” iteration with a similar, but 
additive, iteration: 


> e GOE 20 _ fy". (6) 


This is very similar to an algorithm proposed by Mason and Upton. If we
write $\delta_i^q = \bigl(1 + (c \varepsilon_i^q)^2\bigr)^{1/2} = 1/w_i^q$, the form simplifies to


m f BAe: ei z 
3 e (a) — (ce! ò eae fi am? \ 


Again it is observed that the algorithm converges rapidly in practice to a 
near-best approximation. 


3.3. Maximum likelihood approaches 


The effectiveness of the ALS approach depends on having a suitable estimate
of the standard noise in the data. In this section we describe an
approach borrowed from maximum likelihood estimation to provide such
an estimate. We consider the function $\tau(x) = (x/\sigma)/(1 + c^2 (x/\sigma)^2)^{1/2}$,
where now $c$ is fixed but $\sigma$ has to be determined. Writing

$$\tilde\varepsilon_i = w_i \varepsilon_i, \qquad w_i = w_i(\sigma) = \frac{1}{(\sigma^2 + c^2 \varepsilon_i^2)^{1/2}},$$

we can regard the ALS as a nonlinearly weighted least squares approach
where the weights are parametrized by $\sigma$. We determine estimates of $\sigma$ and
$a$ by solving the optimization problem

$$\min_{a,\,\sigma} \left\{ \frac{1}{2} \sum_{i=1}^{m} \tilde\varepsilon_i^2(a, \sigma) - \sum_{i=1}^{m} \log w_i(\sigma) \right\}. \tag{7}$$

As in maximum likelihood estimation, the solution estimates of $a$ and $\sigma$
are those that are the most likely to give rise to the data $\{(x_i, f_i)\}$. If $\sigma$
is known, then the objective function in (7) simplifies to a least squares
objective function (1).


4. Example: polynomial approximation 


In this section, we discuss the behaviour of ALS methods on polynomial
regression. The model function is linear and of the form

$$F(x, a) = \sum_{j=1}^{n} a_j \phi_j(x),$$

where $\phi_j$ is a Chebyshev polynomial basis function of degree $j - 1$. Figure 2
gives an example of least squares (LS) and asymptotic least squares (ALS)
fits to 26 data points with 5 outliers (i.e., approximately 20% outliers). The
figure shows that the ALS fit follows the majority of the data while the LS
fit is skewed by the wild points.

Figure 2. Example LS and ALS polynomial fits to data with outliers. The ALS fit
follows the majority of the data while the LS fit is skewed by the wild points.

We use Monte Carlo simulation to compare ALS with LS approximation.
We generate sets of data as follows. We fix $m$ the number of data points,
$m_0 < m$ the number of wild points, $n$ the number of basis functions,
$\sigma$ the standard deviation for standard noise and $\sigma_0 > \sigma$ the standard
deviation associated with wild points. We form uniformly spaced abscissae
$x = (x_1, \ldots, x_m)^T$ and corresponding observation matrix $C$ with $c_{ij} =
\phi_j(x_i)$. Given $a = (a_1, \ldots, a_n)^T$ and number $N$ of Monte Carlo simulations,
we perform the following steps.

I Set $f_0 = Ca$. The points $(x_i, f_{i,0})$ lie on the polynomial curve
specified by parameters $a$.
II For each $q = 1, \ldots, N$,

i Add standard noise: $f_q = f_0 + r$, $r_i \in N(0, \sigma^2)$.

ii Choose at random a subset $I_q \subset \{1, \ldots, m\}$ of $m_0$ indices.

iii Add large perturbations: for each $i \in I_q$, $f_q(i) = f_q(i) + r_{i,0}$,
$r_{i,0} \in N(0, \sigma_0^2)$.

iv Find the least squares and asymptotic least squares fits to
data set $\{(x_i, f_{i,q})\}$ determining parameter estimates $a_q$ and
$\tilde a_q$, respectively.

v Calculate estimates of the standard deviation of the residuals:

$$\hat\sigma_q^2 = \varepsilon^T \varepsilon/(m - n), \qquad \tilde\sigma_q^2 = \tilde\varepsilon^T \tilde\varepsilon/(m - n),$$

with $\varepsilon = f_q - C a_q$, $\tilde\varepsilon = \tau(f_q - C \tilde a_q)$.

vi Store estimates: $A(q, 1{:}n) = a_q^T$, $\tilde A(q, 1{:}n) = \tilde a_q^T$, $S(q) = \hat\sigma_q$
and $\tilde S(q) = \tilde\sigma_q$.

From these simulations we can compare, for example, the sample standard
deviations $u^{MC}_{LS}$ and $u^{MC}_{ALS}$ of the calculated parameter values $\{A(i, j)\}_{i=1}^{N}$
and $\{\tilde A(i, j)\}_{i=1}^{N}$, respectively, with that estimated using the standard approach
described in 3.1.
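
The simulation steps above can be sketched in Python as follows. This is a simplified illustration under stated assumptions rather than the computation behind Table 1: it uses a straight-line model ($n = 2$) instead of a Chebyshev basis, obtains the ALS-type fit by the iteratively weighted scheme of section 3.2, omits the residual standard deviations of step v, and uses far fewer trials. All function names, the seed and the "true" parameters are hypothetical.

```python
import random
import statistics

def weighted_line(x, f, w2):
    """Minimise sum_i w2[i] * (f[i] - a0 - a1*x[i])**2; returns (a0, a1)."""
    sw = sum(w2)
    swx = sum(w * xi for w, xi in zip(w2, x))
    swf = sum(w * fi for w, fi in zip(w2, f))
    swxx = sum(w * xi * xi for w, xi in zip(w2, x))
    swxf = sum(w * xi * fi for w, xi, fi in zip(w2, x, f))
    det = sw * swxx - swx ** 2
    a1 = (sw * swxf - swx * swf) / det
    return (swf - a1 * swx) / sw, a1

def als_line(x, f, c, iterations=10):
    """Iteratively weighted fit with weights (1 + (c*eps)^2)^(-1/2)."""
    w2 = [1.0] * len(x)
    for _ in range(iterations):
        a0, a1 = weighted_line(x, f, w2)
        w2 = [1.0 / (1.0 + (c * (fi - a0 - a1 * xi)) ** 2)
              for xi, fi in zip(x, f)]
    return a0, a1

def monte_carlo(m=51, m0=5, sigma=0.0005, sigma0=0.01, trials=200, seed=1):
    rng = random.Random(seed)
    x = [-1.0 + 2.0 * i / (m - 1) for i in range(m)]   # uniform abscissae
    f0 = [1.0 + 0.5 * xi for xi in x]                  # step I (assumed true line)
    c = 1.0 / (4.0 * sigma)
    ls_slopes, als_slopes = [], []
    for _ in range(trials):                            # step II
        f = [fi + rng.gauss(0.0, sigma) for fi in f0]  # i: standard noise
        for i in rng.sample(range(m), m0):             # ii: random subset
            f[i] += rng.gauss(0.0, sigma0)             # iii: wild points
        ls_slopes.append(weighted_line(x, f, [1.0] * m)[1])   # iv: LS fit
        als_slopes.append(als_line(x, f, c)[1])               # iv: ALS-type fit
    # vi: sample standard deviations of the stored gradient estimates
    return statistics.stdev(ls_slopes), statistics.stdev(als_slopes)

u_ls, u_als = monte_carlo()
```

Comparing `u_ls` with `u_als` gives a small-scale analogue of the comparison between the Monte Carlo uncertainties of the LS and ALS fits.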
Table 1 gives the uncertainties for the case $m = 51$, $m_0 = 5$, $n = 7$,
$\sigma = 0.0005$, $\sigma_0 = 0.01$ and $N = 5000$. The ALS fits were determined using
$\tau(x) = x(1 + c^2 x^2)^{-1/2}$ with $c = (4\sigma)^{-1}$. The table presents the standard
uncertainties associated with the fitted parameters calculated in a number
of ways: $u_{LS}$, LS fit using $\hat\sigma$; $u^0_{LS}$, LS using $\sigma$; $u^{MC}_{LS}$, LS Monte Carlo
estimate; $\tilde u_{ALS}$, ALS using $\tilde\sigma$ and approximate Hessian matrix; $u_{ALS}$,
ALS using $\tilde\sigma$ and exact Hessian matrix; $u^0_{ALS}$, ALS using $\sigma$ and exact
Hessian matrix; $u^{MC}_{ALS}$, ALS Monte Carlo estimate. The table indicates:

i) For the LS fit, the estimates $u_{LS}$ using the posterior estimate $\hat\sigma$
are in good agreement with the Monte Carlo estimates $u^{MC}_{LS}$. This
reflects the fact that for linear problems, the posterior estimate of
$\sigma$ is reasonably reliable.

ii) The estimates $u^0_{LS}$ using the prior estimate $\sigma$ significantly underestimate
the variation. These estimates would be valid if there were
no outliers.

iii) For the ALS fit, there is a modest difference between the estimates
$\tilde u_{ALS}$ using the approximate Hessian $J^T J$ and those $u_{ALS}$ using
the exact Hessian $H = J^T J + G$.

iv) The estimates using the posterior estimate $\tilde\sigma$ are larger than the
Monte Carlo estimate $u^{MC}_{ALS}$ while the estimate $u^0_{ALS}$ using the prior
estimate $\sigma$ is in reasonable agreement with the Monte Carlo estimates.
The average value for $\tilde\sigma$ is approximately 0.00074 compared
with $\sigma = 0.0005$.

v) The Monte Carlo estimates $u^{MC}_{ALS}$ are only modestly larger than
$u^0_{LS}$, the uncertainty estimates we would expect if there were no
outliers present. They are approximately a factor of five smaller
than those for the LS fits.


The results show firstly the effectiveness of ALS approximation relative to 
LS and secondly, the uncertainty estimates associated with the ALS fits are 
a reasonable guide. 

These results have been generated using a value of $c$ based on the known
standard deviation $\sigma$ associated with the standard noise. In practice, we
may only have a very approximate estimate of $\sigma$. We have also applied
a maximum likelihood (ML) approach (section 3.3) to this type of data,
using the LS estimates of $a$ and posterior estimate $\hat\sigma$ as starting estimates.
The average ML estimate of $\sigma$ is approximately 0.00057 and compares well
with the assigned value of 0.0005. The ML estimates of $a$ are close to
those determined by ALS with the assigned value of $\sigma$. While the ML
approach requires general optimization algorithms rather than least squares
algorithms, it may well be valuable in situations where little is known about
the underlying noise.

Table 1. Estimates of the standard uncertainties of the fitted parameters:
$u_{LS}$: LS fit using $\hat\sigma$; $u^0_{LS}$: LS using $\sigma$; $u^{MC}_{LS}$: LS Monte Carlo
estimate; $\tilde u_{ALS}$: ALS using $\hat\sigma$ and approximate Hessian matrix; $u_{ALS}$:
ALS using $\hat\sigma$ and exact Hessian matrix; $u^0_{ALS}$: ALS using $\sigma$ and exact
Hessian matrix; $u^{MC}_{ALS}$: ALS Monte Carlo estimate.


[tars uars aps | VALS | 
0.2616 0.1907 
0.2137 0.1565 
0.1976 0.1437 


0.1904 0.1383 
0.1766 0.1247 
0.1699 0.1217 
0.1668 0.1199 





5. Example: approximation with radial basis functions 


In this example we fit a radial basis surface to data. The radial basis
function (RBF) has centres $\lambda_j$ lying on a regular 5 x 5 grid, a subgrid of a
21 x 21 grid of points $x_i$. The observation matrix was defined as

$$C_{ij} = \exp\{-\|x_i - \lambda_j\|^2/(2\sigma^2)\},$$

associated with a Gaussian RBF. Given $a$, a vector of coefficients, the
height $f^0$ of the surface at the grid points was calculated from $f^0 = Ca$.
Figure 3 graphs the resulting surface. Random perturbations with standard
deviation $\sigma = 0.001$ were added to $f^0$ to generate heights $f$ with
$f_i = f^0_i + r_i$, $r_i \in N(0, \sigma^2)$. To simulate wild points, we set $f_0 = f$ and at
40 random locations $i \in I_0 \subset \{1, \ldots, 441\}$, large perturbations were added:
$f_{i,0} = f_i + r_{i,0}$, $r_{i,0} \in N(0, \sigma_0^2)$, with $\sigma_0 = 0.020$. In Figure 4 we graph the
difference between the LS fit to the heights $f_0$ (with wild points) and that
to heights $f$ (with standard noise). Figure 5 plots the same information
for the ALS fit. For the LS fit the wild points skew the fitted surface by
up to 0.006 (much larger than $\sigma$) while for the ALS fit the difference is
smaller than 0.0004 (less than $\sigma/2$). The ALS approach is much better at
approximating the underlying surface in the presence of wild points.
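
A minimal sketch of this experiment is given below. It is not the implementation used for the figures: the RBF width (`width=0.25`), the unit-square grid, the random coefficients and the reweighting scheme (row weights $1/(1 + c^2 r_i^2)$, a standard choice that never increases $\sum_i \tau(r_i)^2$) are assumptions made for illustration, and all function names are hypothetical. The transform is $\tau(x) = x(1 + c^2x^2)^{-1/2}$ with $c = (4\sigma)^{-1}$.

```python
import numpy as np

def rbf_matrix(points, centres, width):
    """Gaussian RBF observation matrix C[i, j] = exp(-|x_i - l_j|^2 / (2 width^2))."""
    d2 = ((points[:, None, :] - centres[None, :, :]) ** 2).sum(axis=2)
    return np.exp(-d2 / (2.0 * width ** 2))

def als_objective(r, c):
    """Sum of tau(r_i)^2 with tau(x) = x / sqrt(1 + c^2 x^2)."""
    return float(np.sum(r ** 2 / (1.0 + (c * r) ** 2)))

def als_fit(C, f, c, iters=50):
    """Iteratively reweighted least squares, started from the LS solution."""
    a = np.linalg.lstsq(C, f, rcond=None)[0]
    for _ in range(iters):
        r = f - C @ a
        w = 1.0 / (1.0 + (c * r) ** 2)          # row scaling; large residuals are damped
        a = np.linalg.lstsq(C * w[:, None], f * w, rcond=None)[0]
    return a

def demo(seed=0, sigma=0.001, sigma0=0.020, n_wild=40, width=0.25):
    rng = np.random.default_rng(seed)
    g = np.linspace(0.0, 1.0, 21)
    points = np.array([(x, y) for x in g for y in g])              # 21 x 21 grid
    centres = np.array([(x, y) for x in g[::5] for y in g[::5]])   # 5 x 5 subgrid
    C = rbf_matrix(points, centres, width)
    f_exact = C @ rng.standard_normal(25)
    f = f_exact + sigma * rng.standard_normal(441)                 # standard noise
    wild = rng.choice(441, size=n_wild, replace=False)
    f[wild] += sigma0 * rng.standard_normal(n_wild)                # wild points
    c = 1.0 / (4.0 * sigma)
    a_ls = np.linalg.lstsq(C, f, rcond=None)[0]
    a_als = als_fit(C, f, c)
    rms = lambda a: float(np.sqrt(np.mean((C @ a - f_exact) ** 2)))
    return {"shape": C.shape,
            "rms_ls": rms(a_ls), "rms_als": rms(a_als),
            "obj_ls": als_objective(f - C @ a_ls, c),
            "obj_als": als_objective(f - C @ a_als, c)}
```

Calling `demo()` returns the root-mean-square distance of each fitted surface from the noise-free surface, which is one simple way to compare how strongly the wild points pull the LS and ALS fits.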

We use this example to compare the convergence of three optimization
algorithms, the iteratively reweighted algorithm (IW) discussed in
section 3.2 based on equation (5) and the Gauss-Newton (GN) and Newton
(N) algorithms discussed in section 3.1. Table 2 shows the norm of the
step taken (change in the vector of parameter estimates) at each iteration.
Both the IW and GN algorithms show linear convergence behaviour with
similar rates of convergence. The Newton algorithm exhibits quadratic
convergence.
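
The convergence behaviour can be checked directly from the step norms reported in Table 2. The short sketch below computes successive ratios (roughly constant for linear convergence) and an empirical order estimate (close to 2 for quadratic convergence). The attribution of the three value sequences to IW, GN and N follows the column order of the table caption and is an assumption; the helper names are illustrative.

```python
import numpy as np

# Step norms ||p_q|| as read from Table 2 (column attribution assumed).
STEP_NORMS = {
    "IW": [5.8686e-3, 7.6633e-4, 1.6589e-4, 4.1534e-5, 1.0467e-5, 2.6355e-6,
           6.6184e-7, 1.6578e-7, 4.1434e-8, 1.0339e-8, 2.5770e-9, 6.4177e-10],
    "GN": [9.9268e-3, 3.5497e-3, 1.5802e-4, 1.6233e-5, 2.4621e-6, 5.0043e-7,
           1.0022e-7, 1.9966e-8, 3.9706e-9, 7.8927e-10],
    "N":  [5.9427e-3, 8.5095e-4, 3.9609e-5, 2.8828e-8, 1.6620e-14],
}

def successive_ratios(steps):
    """Ratios ||p_{q+1}|| / ||p_q||; near-constant values indicate linear convergence."""
    s = np.asarray(steps, dtype=float)
    return s[1:] / s[:-1]

def estimated_order(steps):
    """Empirical order: log(s_{q+1}/s_q) / log(s_q/s_{q-1})."""
    d = np.diff(np.log(np.asarray(steps, dtype=float)))
    return d[1:] / d[:-1]

if __name__ == "__main__":
    for name, steps in STEP_NORMS.items():
        print(name, successive_ratios(steps)[-1], estimated_order(steps)[-1])
```

The last ratios settle near 0.25 and 0.2 for the two linearly convergent sequences, and the order estimate for the third sequence approaches 2.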
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Figure 3. Surface generated using a Gaussian radial basis function. 





Figure 4. Difference between the LS RBF approximation to data with wild points
(mixed noise) and data with standard noise.

Figure 5. Difference between the ALS RBF approximation to data with wild points
(mixed noise) and data with standard noise.

Table 2. Convergence behaviour of the IW, GN and N algorithms for approximation
with radial basis functions to data with wild points. For each algorithm the
table gives the norm $\|p_q\| = \|a_q - a_{q-1}\|$ of the change in parameter values
at the qth iteration. Each algorithm starts with estimates $a_0$, the solution
to the standard least squares problem.

 q  |     IW      |     GN      |      N
  1 | 5.8686e-003 | 9.9268e-003 | 5.9427e-003
  2 | 7.6633e-004 | 3.5497e-003 | 8.5095e-004
  3 | 1.6589e-004 | 1.5802e-004 | 3.9609e-005
  4 | 4.1534e-005 | 1.6233e-005 | 2.8828e-008
  5 | 1.0467e-005 | 2.4621e-006 | 1.6620e-014
  6 | 2.6355e-006 | 5.0043e-007 |
  7 | 6.6184e-007 | 1.0022e-007 |
  8 | 1.6578e-007 | 1.9966e-008 |
  9 | 4.1434e-008 | 3.9706e-009 |
 10 | 1.0339e-008 | 7.8927e-010 |
 11 | 2.5770e-009 |             |
 12 | 6.4177e-010 |             |


6. Example: assessment of aspheric surfaces


In determining the shape of high quality optical surfaces using measurements
gathered by a coordinate measuring machine, care must be taken
to ensure that the optical surface is not damaged by the contacting probe.
However, using a low-force probing scheme, the presence of particles of dust
on the artefact's surface introduces sporadic, large non-random effects into
the measurement data. Figure 6 shows the residuals associated with an
ALS fit of a hyperboloid surface to measurements of an aspheric mirror, a
component in an earth observation camera. The spikes are due to particles
of dust on the mirror or on the spherical probe. It is judged that 9 of the 401
measurements (i.e., approximately 2%) have been contaminated. Because
the dust particles must necessarily have a positive diameter, an asymmetric
transform function $\tau$ was used in which only large, positive approximation
errors are transformed. The standard noise associated with the measurements
is of the order of 0.0002 mm while the diameter of the dust particles
is of the order of 0.002 mm. The difference between the ALS fitted surface
and that generated using a standard (nonlinear) approach was of the order
of 0.0004 mm, and is seen to be significant relative to the standard noise.

Figure 6. Residuals associated with an ALS fit of a hyperboloid surface to measurements
of an aspheric mirror. The spikes are due to particles of dust on the mirror or on the
spherical probe. The units for each axis are millimetres.
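
The exact form of the asymmetric transform is not given here. One possible construction, shown purely as a hypothetical sketch, leaves negative approximation errors untouched and applies $\tau(x) = x(1 + c^2x^2)^{-1/2}$ to positive ones only. The sign convention (positive error meaning a measurement lying above the fitted surface) and the function names are assumptions.

```python
import numpy as np

def tau(x, c):
    """Symmetric ALS transform: close to x for small |x|, bounded by 1/c for large |x|."""
    x = np.asarray(x, dtype=float)
    return x / np.sqrt(1.0 + (c * x) ** 2)

def tau_asymmetric(x, c):
    """Hypothetical asymmetric variant: only positive errors are transformed."""
    x = np.asarray(x, dtype=float)
    return np.where(x > 0.0, tau(x, c), x)

if __name__ == "__main__":
    c = 1.0 / (4.0 * 0.0002)          # c = 1/(4 sigma) with sigma = 0.0002 mm
    print(tau_asymmetric([-0.002, 0.00001, 0.002], c))
```

With this choice a dust-sized positive error of 0.002 mm contributes less than $1/c$ to the objective, whereas a negative error of the same size is penalised in full.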


7. Concluding remarks 


In this paper we have described a simple approach, asymptotic least squares
(ALS), to approximation to data associated with mixed noise. The approach
is based on least squares and uses a transform function with appropriate
asymptotic behaviour. The estimates of the fitted parameters are
found by solving a nonlinear least squares problem and we have described
solution algorithms based on standard optimization techniques and also
iteratively re-weighting approaches. These algorithms are easy to implement
using standard least squares computational components.

The ALS approach can be applied to any situation in which a standard 
least squares analysis is appropriate. We have demonstrated its behaviour 
in approximation with polynomials, radial basis functions and aspheric sur- 
faces and shown that the approach is effective in dealing with wild points. 
It is straightforward to generate uncertainty estimates associated with the 
fitted parameters but further work is required to improve and validate these 
approaches. It is possible that a maximum likelihood approach may prove
useful.

This paper deals with a method for the calibration of optical sensors on CMMs
that relies on the measurement of specific artefacts. The calibration process 
consists of establishing the global transfer function between the 2D data given 
in the 2D space of the sensor and the 3D coordinates of points in the CMM 
space. The identification of the transfer function parameters requires the 
measurement of geometrical artefacts. In the paper, we suggest a specific 
artefact, the facet sphere. Algorithms for parameter identification are thus 
developed, based on the extraction of interest points belonging to the artefact. 
An estimation of dispersions associated with the method highlights the effect of 
some configuration parameters of the measuring system such as the position in 
the 2D space. 


1. Introduction 


All measuring methods using Coordinate Measuring Machines (CMMs) 
equipped with various sensor configurations or various sensor orientations 
require algorithms for the registration of point sets independently acquired. 

The issue is to identify the transformations that express the coordinates
of the point sets obtained from different measurements in a single
coordinate system. This problem is referred to as calibration, and
concerns the choice of the identification method and the definition of the
associated algorithms. Most generally, the identification method relies on the
measurement of artefacts. In addition, the map of the measurement
uncertainties is established in the CMM workspace. Even though these problems
are quite well controlled for mechanical contact probes, they remain open
when using optical sensors, such as a laser-plane sensor.
Indeed, when measuring with mechanical probes, calibration methods in 3D are
based on the measurement of a reference sphere that guarantees the registration
of the different measurements within the CMM coordinate system. Uncertainties
can be evaluated in the whole CMM workspace, and are assumed to be
independent of the sensor orientation. When using optical sensors, existing
calibration methods generally only allow the calibration of the sensor itself
(optical calibration).

* Work partially funded under EU SofTools_MetroNet Contract N° G6RT-CT-2001-05061.

This paper deals with a calibration method for CMMs equipped with a
laser-plane sensor. For the acquisition system we consider a motorized indexing
head PH10 from Renishaw (http://www.renishaw.com), which supports the
sensor and allows it to be oriented to repeatable positions, increasing its
accessibility space. Note that this configuration is as yet seldom used. With such
a measuring system, 3D points are determined by triangulation.
The sensor therefore consists of a transmitter and a receiver. The transmitter
sends a thin laser beam whose width allows a large part of the object surface to
be covered; the receiver, a CCD (charge-coupled device) camera, views
the laser plane at a given fixed triangulation angle. The CCD
camera thus observes the intersection between the laser plane and
the object surface as a 2D image (Figure 1).


Figure 1. Measuring system and principle of acquisition (labels: laser beam
emission; 2D observation in the CCD space).

The 3D sensor used in our developments is the KLS51 system from Kréon
Technologies (http://www.kreon3D.com). Following the acquisition is the
calibration process, which defines the relationship between the 2D data and the
3D coordinates of points belonging to the object surface. The calibration
process thus conditions the accuracy of point acquisition.

The paper is organized as follows. In section 2, we detail the calibration
process we have developed for laser-plane sensors mounted on CMMs. We show that
the calibration process leads to a global transfer function that transforms
the 2D data into 3D coordinates of points in the CMM
workspace. The transfer function involves parameters that must be identified.
This point is addressed in section 3. The inverse problem is solved by the
measurement of artefacts, the shapes of which are clearly defined and well
known in the 3D workspace. However, some configuration parameters, such as
the position of the point in the CCD space, may affect the measurements. The
most suitable artefact is therefore studied. Section 4 is dedicated to the
development of the calibration algorithms for the chosen artefact, the facet
sphere. In section 5, uncertainties associated with the method, in particular those
linked to the variability of the parameters, are estimated. This highlights the
importance of the location in the 2D CCD space. The conclusion in section 6
indicates advantages and disadvantages of the proposed method.


2. Calibration process 


The calibration process is the main element of an acquisition system, for it
expresses the link between the 2D coordinates N(R, C) (R: row, C: column) in
the CCD space and the 3D coordinates M(X, Y, Z):

$$N(R, C) \;\longmapsto\; M(X, Y, Z): \qquad X = f_1(R, C), \quad Y = f_2(R, C), \quad Z = f_3(R, C).$$

Figure 2. Calibration process: transformation of the 2D data into 3D coordinates.


Following the definition of Tsai, “camera calibration in the context of 3D
machine vision is the process that determines the internal geometric and optical
camera characteristics (intrinsic parameters) and/or the 3D position and
orientation of the camera frame relative to a certain world coordinate system
(extrinsic parameters)”. Basically, the calibration process requires the definition
of a calibration model, and procedures and algorithms to identify both extrinsic
and intrinsic parameters. The method we develop allows the identification of all
parameters at the same time. The basis of the method, first presented by
Dantan, is developed next. It relies on the geometric modelling of the
acquisition system.


2.1. Modelling of the acquisition system 


Modelling requires the definition of coordinate systems attached to the different
elements of the acquisition system.





Figure 3. Relative positions and orientations of the different elements of the acquisition 
system. 


CMM coordinate system: $R_m$ $(O_m, X_m, Y_m, Z_m)$
$R_m$ is associated with the CMM; its axes are the machine translation axes,
and its origin is $O_m$, which is unknown.

Reference coordinate system: $R_0$ $(O_0, X_m, Y_m, Z_m)$
This system is the absolute framework and is attached to a reference object
located in the workspace of the CMM. The origin $O_0$ is a characteristic
point of the object.

CCD coordinate system: $R_{CCD}$
It is the coordinate system attached to the CCD camera, for which the axes
R and C are parallel to the rows and columns of the CCD matrix.
Coordinates of a point are referred to by $(R_i, C_i)$.

Camera coordinate system: $R_c$ $(O_c, X_c, Y_c, Z_c)$
The origin $O_c$ is the projection centre. The axis $Z_c$ coincides with the optical
axis. The axes $X_c$ and $Y_c$ are the projections of (R, C) of the CCD matrix
onto the plane normal to $Z_c$.

Laser-plane coordinate system: $R_{la}$ $(O_{la}, X_{la}, Y_{la}, Z_{la})$
The origin $O_{la}$ is the intersection of the optical axis of the lenses and the
laser plane. The $Z_{la}$ axis is normal to the laser plane, and $X_{la}$ and $Y_{la}$ are the
projections of the directions (R, C) onto the laser plane.

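Let us consider a point M belonging to the object surface. The 3D
coordinates (X, Y, Z) of M are expressed in the reference coordinate system $R_0$.
M also belongs to the laser-plane space. Its coordinates in $R_{la}$ are $(X_{la}, Y_{la}, Z_{la})$.
Therefore, we can write:

$$\overrightarrow{O_0M} = \overrightarrow{O_0O_m} + \overrightarrow{O_mP} + \overrightarrow{PO_{la}} + \overrightarrow{O_{la}M} \qquad (1)$$

$\overrightarrow{O_mP}$ corresponds to the position of the sensor, which is identified by the CMM
coordinates (reading of the measuring scales): $\overrightarrow{O_mP} = X_s \mathbf{X}_m + Y_s \mathbf{Y}_m + Z_s \mathbf{Z}_m$.
$\overrightarrow{O_0O_m}$ and $\overrightarrow{PO_{la}}$ are translation vectors, which are a priori unknown. Let us
consider $\mathbf{R}$, the rotation matrix between the bases $B_m(X_m, Y_m, Z_m)$ and
$B_{la}(X_{la}, Y_{la}, Z_{la})$; each vector U is transformed as follows: $U^{B_m} = \mathbf{R}\, U^{B_{la}}$ with

$$\mathbf{R} = \begin{pmatrix} r_{11} & r_{12} & r_{13} \\ r_{21} & r_{22} & r_{23} \\ r_{31} & r_{32} & r_{33} \end{pmatrix}$$

This leads to the following equation:

$$\begin{pmatrix} X \\ Y \\ Z \end{pmatrix} = \begin{pmatrix} T_{O_m x} + T_{O_{la} x} + X_s \\ T_{O_m y} + T_{O_{la} y} + Y_s \\ T_{O_m z} + T_{O_{la} z} + Z_s \end{pmatrix} + \begin{pmatrix} r_{11} & r_{12} & r_{13} \\ r_{21} & r_{22} & r_{23} \\ r_{31} & r_{32} & r_{33} \end{pmatrix} \begin{pmatrix} X_{la} \\ Y_{la} \\ Z_{la} \end{pmatrix} \qquad (2)$$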


Eq. (2) involves 6 parameters corresponding to the location of the sensor
within the 3D space: 3 translations and 3 rotations. These parameters are
generally called extrinsic parameters. However, most generally the rotation
movements are limited to 4 predefined positions of the sensor, with the
laser track parallel to one displacement axis (X-axis or Y-axis). Only the
translation parameters then have to be identified, which is simply carried out by the
measurement of a sphere located within the workspace.
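
Eq. (2) is a rigid transformation and can be written in a few lines. The sketch below is only an illustration: the function names, the argument names `t_om` and `t_ola` for the two unknown translations, and the use of a rotation about $Z$ to stand for one of the predefined sensor orientations are assumptions.

```python
import numpy as np

def rot_z(angle):
    """Rotation matrix about the Z axis (one possible sensor orientation)."""
    c, s = np.cos(angle), np.sin(angle)
    return np.array([[c, -s, 0.0],
                     [s,  c, 0.0],
                     [0.0, 0.0, 1.0]])

def laser_to_reference(p_la, sensor_pos, t_om, t_ola, rot):
    """Eq. (2): translations + sensor position + rotated laser-plane coordinates."""
    return (np.asarray(t_om, dtype=float) + np.asarray(t_ola, dtype=float)
            + np.asarray(sensor_pos, dtype=float)
            + np.asarray(rot, dtype=float) @ np.asarray(p_la, dtype=float))

if __name__ == "__main__":
    p = laser_to_reference([1.0, 0.0, 0.0], [10.0, 20.0, 30.0],
                           [1.0, 2.0, 3.0], [0.1, 0.2, 0.3], rot_z(np.pi / 2))
    print(p)
```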

To express the global transfer function that links the 2D coordinates (R, C) 
to the 3D coordinates (X, Y, Z), we have to consider the transformation between 
the laser-plane system and the CCD coordinate system. We thus need to define 
the geometric model of the CCD camera. 


2.2. Modelling the CCD Camera 


The modelling consists of expressing, through parameters, the geometric
behaviour of the CCD camera in the process of image formation. The most widely
used camera model is the pinhole model. The camera objective is assumed to
be an ideal lens, the image is assumed to form on a plane (the image plane), and
the axes of the image plane are assumed to be perpendicular. This model
corresponds to a central projection, for which the centre of projection is the
optical centre. Therefore, a point is projected onto the image plane through
the optical centre. The distance between the image plane and the optical centre
is the focal distance f.

Basically, the pinhole model leads to an analytical relation between the
coordinates of the point expressed in the object space and those in the image
plane:

$$(R, C) = F(X_c, Y_c, Z_c) \qquad (3)$$

Note that an image point does not correspond to a single point,
but to a single ray in the object space.

Parameters of the function F correspond to the intrinsic parameters
associated with the CCD camera. This model can be extended to include effects
linked to distortions, taking into account various kinds of defects: geometrical
quality of the lenses, misalignment of the lenses, etc. This increases the number
of parameters to be identified.


2.3. Modelling in relation with the laser plane 


The system that makes up the laser beam is composed of a light diode, a lens
and a set of mirrors. The light diode is fixed on the optical axis. From a light
source, the double lens will form a plane beam that focuses at a given distance.
The geometric model of the laser beam can be a plane, or a part of a cone, or a
polynomial surface.

As for the model of the CCD camera, this leads to a relationship between
the (R, C) coordinates and the $(X_{la}, Y_{la}, Z_{la})$ coordinates related to the
laser-plane coordinates through the central projection. Note that the model must take
into account the angle between the laser plane and the optical axis:

$$(R, C) = G(X_{la}, Y_{la}, Z_{la}) \qquad (3')$$

For the 3D sensor used (KLS51, from Kréon Technologies), the laser beam
is assumed to be plane. Therefore, as a 3D point also belongs to the laser plane
(see Figure 2), its coordinate $Z_{la}$ satisfies $Z_{la} = 0$ and Eq. (3') given by the
constructor is:

$$X_{la} = \frac{b_1R + b_2C + b_3}{d_1R + d_2C + 1}, \qquad Y_{la} = \frac{b_4R + b_5C + b_6}{d_1R + d_2C + 1}, \qquad Z_{la} = 0 \qquad (4)$$


Eq. (4) involves 8 additional parameters. These intrinsic parameters are
linked to the sensor architecture and correspond to the geometrical modelling of
both the CCD camera and the laser beam, which is assumed to be plane.
Basically, these parameters are identified independently, most generally by the
sensor constructor. Salvi provides a survey of various camera-calibrating
methods, highlighting the role of the chosen model and the identification
method in the measurement uncertainty. Indeed, the accuracy of the camera
calibration is one of the most important factors that condition measurement
quality. However, the fact that geometrical parameters of the sensor are
identified independently from the displacement system increases the
measurement uncertainty.


2.4. Expression of the global transfer function of calibration 


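Let us now go back to the definition of a global transfer function. Our objective
is to merge Eq. (2) and Eq. (4) so that we can directly express the coordinates of
a 3D point as a function of the position of the point in the 2D CCD space (R, C).
With $Z_{la} = 0$ and $T_x = T_{O_m x} + T_{O_{la} x}$ (and similarly for $T_y$ and $T_z$), this leads to:

$$\begin{pmatrix} X \\ Y \\ Z \end{pmatrix} = \begin{pmatrix} X_s \\ Y_s \\ Z_s \end{pmatrix} + \begin{pmatrix} T_x \\ T_y \\ T_z \end{pmatrix} + \begin{pmatrix} r_{11}X_{la} + r_{12}Y_{la} \\ r_{21}X_{la} + r_{22}Y_{la} \\ r_{31}X_{la} + r_{32}Y_{la} \end{pmatrix}$$

$$\begin{pmatrix} X \\ Y \\ Z \end{pmatrix} = \begin{pmatrix} X_s \\ Y_s \\ Z_s \end{pmatrix} + \begin{pmatrix} T_x \\ T_y \\ T_z \end{pmatrix} + \begin{pmatrix} r_{11} & r_{12} \\ r_{21} & r_{22} \\ r_{31} & r_{32} \end{pmatrix} \begin{pmatrix} \dfrac{b_1R + b_2C + b_3}{d_1R + d_2C + 1} \\[2ex] \dfrac{b_4R + b_5C + b_6}{d_1R + d_2C + 1} \end{pmatrix}$$

Regrouping in the previous equation terms in R and C yields

$$X = X_s + \frac{a_1R + a_2C + a_3}{a_{10}R + a_{11}C + 1}$$

$$Y = Y_s + \frac{a_4R + a_5C + a_6}{a_{10}R + a_{11}C + 1} \qquad (5)$$

$$Z = Z_s + \frac{a_7R + a_8C + a_9}{a_{10}R + a_{11}C + 1}$$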


The global transfer function of calibration (Eq. (5)) expresses the
coordinates (X, Y, Z) of a 3D point in the CMM system as a function of the
location (R, C) in the CCD space of the sensor. This expression involves 11
global parameters, which are representative of both the intrinsic and the
extrinsic parameters. The calibration of the geometrical parameters of the sensor
and the calibration of the sensor location within the workspace are performed
through the same identification process, which limits the uncertainties. All
sensor orientations can be considered.
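
The regrouping that leads from Eq. (2) and Eq. (4) to Eq. (5) can be verified numerically. In the sketch below, which is an illustration rather than the authors' implementation, the 11 global parameters are built from the 8 intrinsic parameters $(b_1, \ldots, b_6, d_1, d_2)$, the first two columns of the rotation matrix and the combined translation $T$; for the X row, for instance, $a_1 = T_x d_1 + r_{11}b_1 + r_{12}b_4$, $a_2 = T_x d_2 + r_{11}b_2 + r_{12}b_5$, $a_3 = T_x + r_{11}b_3 + r_{12}b_6$, with $a_{10} = d_1$ and $a_{11} = d_2$. All function names are hypothetical.

```python
import numpy as np

def laser_plane_point(R, C, b, d):
    """Eq. (4): laser-plane coordinates of the pixel (R, C); Z_la = 0."""
    den = d[0] * R + d[1] * C + 1.0
    return np.array([(b[0] * R + b[1] * C + b[2]) / den,
                     (b[3] * R + b[4] * C + b[5]) / den,
                     0.0])

def global_parameters(rot, T, b, d):
    """Regroup Eq. (2) and Eq. (4) into the 11 parameters a_1..a_11 of Eq. (5)."""
    a = np.empty(11)
    for i in range(3):                       # rows X, Y, Z
        a[3 * i + 0] = T[i] * d[0] + rot[i, 0] * b[0] + rot[i, 1] * b[3]
        a[3 * i + 1] = T[i] * d[1] + rot[i, 0] * b[1] + rot[i, 1] * b[4]
        a[3 * i + 2] = T[i] + rot[i, 0] * b[2] + rot[i, 1] * b[5]
    a[9], a[10] = d[0], d[1]
    return a

def transfer(R, C, sensor_pos, a):
    """Eq. (5): 3D point from the pixel (R, C) and the sensor position."""
    den = a[9] * R + a[10] * C + 1.0
    num = np.array([a[0] * R + a[1] * C + a[2],
                    a[3] * R + a[4] * C + a[5],
                    a[6] * R + a[7] * C + a[8]])
    return np.asarray(sensor_pos, dtype=float) + num / den
```

Evaluating `transfer` with the regrouped parameters gives the same point as applying Eq. (4) followed by Eq. (2), which is a convenient consistency check on an implementation.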

The identification process relies on measurements of artefacts of specific 
shapes. The choice of the artefact shape is important for the efficiency and the 
robustness of the calibration method. 


3. Identification of calibration parameters 


The identification process entails the study of the geometry of the artefact,
which has to be simple and reachable whatever the orientation of the sensor.
This artefact becomes the reference element. The shape is conditioned by the
following constraints:

- geometric model,
- accessibility of the sensor.

The most usual artefacts used in dimensional metrology are spheres and
prismatic elements, for their geometry is simple. Furthermore, algorithms for
fitting an ideal geometrical element to the data points are well controlled.

However, in this case the problem is quite different. Our data are the
coordinates $(R_i, C_i)$ in the 2D space of an object whose position is known in the
3D space. Therefore, considering Eq. (5), for a given sensor orientation, the
identification of the 11 parameters relies on the measurement of 11 points $M_k$,
well known in the 3D space. In practice, it is difficult to isolate a point in the
CCD space. As a measurement corresponds to the intersection between the laser
plane and the object surface, the result in the 2D space is a digitising line (see
Figure 4).


Figure 4. Measure in the 2D space: the digitising line (labels: object geometry
known in 3D; digitising line; point $(R_i, C_i)$; 2D space; 3D space).


First, let us consider that the identification is performed through the
measurement of a sphere. Let us suppose that another measuring means, a
contact probe, has precisely measured the sphere. Let C be the centre of the
sphere, with coordinates $(C_x, C_y, C_z)$, and R be its radius (not to be confused
with the row and column coordinates in the CCD space).

Each point $M_k$ in the 3D space belonging to the sphere satisfies:

$$(X_k - C_x)^2 + (Y_k - C_y)^2 + (Z_k - C_z)^2 = R^2$$

Each point $M_k$ that has been measured also satisfies Eq. (5). That means
that there exists at least one point $N(R_i, C_i)$ corresponding to $M_k$ in the 2D
space:

$$\left(X_s + \frac{a_1R_i + a_2C_i + a_3}{a_{10}R_i + a_{11}C_i + 1} - C_x\right)^2 + \left(Y_s + \frac{a_4R_i + a_5C_i + a_6}{a_{10}R_i + a_{11}C_i + 1} - C_y\right)^2 + \left(Z_s + \frac{a_7R_i + a_8C_i + a_9}{a_{10}R_i + a_{11}C_i + 1} - C_z\right)^2 = R^2 \qquad (6)$$

Eq. (6) is no longer linear in the parameters. This complicates the
identification problem. However, measuring a large number of points $M_k$ (more
than 11) could solve the problem. Furthermore, we have to define a
measurement protocol in order to be sure that all the points $M_k$ are sufficiently
representative of the 3D geometry of the sphere. Note that, as the orientation of
the sensor must be preserved during the whole calibration stage, only
transformations that leave the laser plane invariant are possible. That means
2 translations in the plane $(X_{la}, Y_{la})$ and 1 rotation around $Z_{la}$.

Considering the previous remarks, the plane seems an interesting reference 
element, as it satisfies a linear equation. However, a single plane does not allow 
the calibration of all sensor orientations. This leads us to define a specific 
artefact that consists of various planes: the facet sphere (Figure 5). 


Figure 5. The facet sphere: measurement of a 3D point (labels: sensor position;
point $N_k$; digitising lines $L_j'$ and $L_p'$; view in the CCD space).


As we have to measure at least 11 points in 3D to identify the 11
parameters, we choose to define interest points. These points are located on an
edge of the artefact, and correspond to the intersection between two adjacent
planes. In the 2D space these points result from the intersection between 2
digitising lines. As they are seldom measured directly, they are defined by
calculation. The algorithms we have developed for the calibration process based
on the facet sphere are presented in the next section.


4. Calibration process using a specific artefact: the facet sphere 


The equation of calibration (Eq. (5)) involves 11 parameters $a_i$ ($i = 1, \ldots, 11$)
that are identified by the extraction of interest points measured on 3 adjacent
facets. The facet sphere is geometrically well known and located within the
workspace of the CMM (measured using a mechanical probe). All the plane
faces have a form deviation less than 1 µm. To be measured via the KLS51
sensor, all faces must be coated with a white powder, which unfortunately
increases the digitising noise. The faces are numbered from 1 to 25, and the
equation of the corresponding plane, in the CMM workspace, is
$\alpha_j x + \beta_j y + \gamma_j z = \delta_j$, with $j \in [1, 25]$.

As previously indicated, the calibration process consists of the
measurement of interest points $M_k$. These points are located on the artefact:
$M_k(X_k, Y_k, Z_k)$ in the CMM space. They are observed in the 2D space; that
means that there exists a point $N_k(R_k, C_k)$ corresponding to $M_k$. However, as the
point belongs to the intersection between two adjacent planes, $N_k$ is the
intersection between two digitising lines, $L_j'$ and $L_p'$, traces of the planes in the
2D space (Figure 5).

Therefore, $M_k$ satisfies, in the CMM space, the equation of each plane to
which it belongs:

$$\alpha_j X_k + \beta_j Y_k + \gamma_j Z_k = \delta_j \qquad (6.a)$$

$$\alpha_p X_k + \beta_p Y_k + \gamma_p Z_k = \delta_p \qquad (6.b)$$

$M_k$ also satisfies the global transfer function and by substitution of $(X_k, Y_k, Z_k)$
given by Eq. (5) into Eq. (6.a) and (6.b), the following two equations that
are linear relative to the unknown parameters $a_i$ are obtained:

$$-K_j = a_{11}K_jC_k + a_{10}K_jR_k + a_9\gamma_j + a_8\gamma_jC_k + a_7\gamma_jR_k + a_6\beta_j + a_5\beta_jC_k + a_4\beta_jR_k + a_3\alpha_j + a_2\alpha_jC_k + a_1\alpha_jR_k$$

$$-K_p = a_{11}K_pC_k + a_{10}K_pR_k + a_9\gamma_p + a_8\gamma_pC_k + a_7\gamma_pR_k + a_6\beta_p + a_5\beta_pC_k + a_4\beta_pR_k + a_3\alpha_p + a_2\alpha_pC_k + a_1\alpha_pR_k$$

where $K_i = \alpha_i X_s + \beta_i Y_s + \gamma_i Z_s - \delta_i$ ($i = j$ or $p$). Remember that the
coordinates $(X_s, Y_s, Z_s)$ corresponding to the linear position of the sensor within
the CMM space are given by the measuring scales.

Each acquisition of a point of interest leads to 2 linear equations, and the
parameters are the solution of a linear system of 2N equations in 11 unknowns,
where N is the number of points of interest used for the identification. For
robustness reasons, the system is solved as an overdetermined system of
36 equations (corresponding to 18 acquisitions of various points of interest).
The choice of these 18 acquisitions results from a specific measurement
protocol for the calibration. Indeed, as said previously, we must ensure that all
points are distinct within the CCD space.
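
The assembly and least-squares solution of this overdetermined system can be sketched as follows. This is an illustration under stated assumptions, not the authors' code: each plane is taken as $\alpha x + \beta y + \gamma z = \delta$ with $K = \alpha X_s + \beta Y_s + \gamma Z_s - \delta$, each acquisition is represented by a tuple `(R, C, sensor_pos, plane_j, plane_p)`, and the function names are hypothetical.

```python
import numpy as np

def plane_row(R, C, sensor_pos, plane):
    """One linear equation in a_1..a_11 for a plane (alpha, beta, gamma, delta)."""
    alpha, beta, gamma, delta = plane
    K = (alpha * sensor_pos[0] + beta * sensor_pos[1]
         + gamma * sensor_pos[2] - delta)
    row = np.array([alpha * R, alpha * C, alpha,
                    beta * R, beta * C, beta,
                    gamma * R, gamma * C, gamma,
                    K * R, K * C])
    return row, -K

def identify(acquisitions):
    """Least-squares estimate of the 11 parameters from 2N plane equations."""
    rows, rhs = [], []
    for R, C, sensor_pos, plane_j, plane_p in acquisitions:
        for plane in (plane_j, plane_p):
            row, value = plane_row(R, C, sensor_pos, plane)
            rows.append(row)
            rhs.append(value)
    return np.linalg.lstsq(np.array(rows), np.array(rhs), rcond=None)[0]

def synthetic_acquisitions(a, n_points=18, seed=0):
    """Noise-free test data: points from Eq. (5), two random planes through each."""
    rng = np.random.default_rng(seed)
    data = []
    for _ in range(n_points):
        R, C = rng.uniform(-1.0, 1.0, size=2)        # normalised pixel coordinates
        s = rng.uniform(-50.0, 50.0, size=3)          # sensor position
        den = a[9] * R + a[10] * C + 1.0
        M = s + np.array([a[0] * R + a[1] * C + a[2],
                          a[3] * R + a[4] * C + a[5],
                          a[6] * R + a[7] * C + a[8]]) / den
        planes = []
        for _ in range(2):
            n = rng.standard_normal(3)
            planes.append((n[0], n[1], n[2], float(n @ M)))
        data.append((R, C, s, planes[0], planes[1]))
    return data
```

With noise-free synthetic data the parameters are recovered to rounding error; with real data the residuals of the 36 equations give a first indication of the quality of the calibration.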

The protocol is thus the following: let us choose 3 parallel sections, for
instance (face 10, face 3), (face 10, face 2), and (face 10, face 1); for each
section, we move the sensor to 2 heights and 3 sweeping positions within the
laser plane while preserving the sensor orientation. For each position, the
coordinates of the sensor are acquired and the point of interest is calculated in
the CCD space by the intersection between the two digitising lines (Figure 6).
The linear system is then solved (using the least-squares criterion), giving the
value of the 11 parameters $a_i$ constituting the global transfer function f.

Figure 6. Acquisition of the points of interest in the 2D space.


However, the points of interest result from the calculation of the
intersection between 2 digitising lines. As the 2D data are very noisy, we
fit the best least-squares line to the data of each digitising line to carry out
the intersection calculation. This method of calculation is itself a source of
errors. Indeed, the fit is performed on a portion of the digitised data for each
line. This portion allows border effects to be limited, in particular those near
the measured edge. Due to the digitising noise, another measurement gives a
different set of data points. The equation of the line fitting the points is thus
different from the previous one. Therefore, there is an uncertainty in the position
of the interest point, which has a great influence on the identification of the
global transfer function. This point is discussed in the next section.


5. Uncertainty linked to the calibration method 


The digitising noise may come from various sources: laser influence,
characteristics of the optics employed, behaviour of the CCD camera, aspect of
the object surface, etc. This point is not discussed here but, whatever its source,
we suppose that the digitising noise exists, and is a factor influencing the
accuracy of the calibration method. We can add to the influencing factors the
anisotropic behaviour of the CCD matrix: the acquisition is strongly related to
the observation location in the 2D space of the CCD camera.

As said previously, a point of interest results from the intersection of two
lines, each one corresponding to the intersection between the laser plane and a
specific face of the facet sphere. The accuracy of the identification of both lines
depends on the digitising noise, and impacts the global transfer function. Indeed,
a first calibration procedure relying on the 18 calculations of $N_k$ leads to the set
of parameters $(a_i)_1$. However, due to the digitising noise, a new calibration
procedure of the system may provide another set of parameters $(a_i)_2$, and so on.
Each set of parameters introduced in Eq. (5) leads to different values of (X, Y,
Z) for the same location (R, C) in the 2D frame, and involves a dispersion on
the location of the 3D point:

$$X + \Delta X = X_s + f_1(R, C) + \Delta f_1$$
$$Y + \Delta Y = Y_s + f_2(R, C) + \Delta f_2 \qquad (7)$$
$$Z + \Delta Z = Z_s + f_3(R, C) + \Delta f_3$$


To evaluate the dispersions due to the extraction of the points of interest
and the impact ($\Delta X$, $\Delta Y$, $\Delta Z$) of such dispersions on the obtained 3D points, we
develop a method relying on numerical simulations.

A first calibration procedure is performed giving 18 acquisitions, which
constitute the reference. From this first step, digitising noise must be evaluated.
An indicator of noise $\delta$ is calculated. Considering that a line is fitted to the noisy
data according to the least-squares criterion, the standard deviation, $\sigma$, of the
deviations of the data relative to the line is calculated (Figure 7b). For each of
the acquisitions k (k varying from 1 to 18), we associate the noise $\delta_k$,
considering that $\delta_k = \sigma_{kj}$ (with j = 1 or 2 for each one of the lines). Note that
only a subset of $n_d$ points is preserved in order to be free from inaccuracies due
to edge effects (ends of the laser line or sharp corner) (Figure 7a).


Figure 7. Evaluation of the digitising noise for each acquisition.

Then we carry out the numerical simulations that correspond to virtual 
calibration procedures”. The ný points are randomly scattered about the initial 
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line, according to the noise 8,. To each new point distribution a straight line is 
fitted using the least-square criterion. The identification of each of the N, points 
of interest is carried out by the calculation of the intersection of the least-squares 
simulated lines. For each acquisition k, we perform Nb simulations; this gives 
Nb points N; (k=1, ....18). By reporting the results in Eq. (5), we obtain various 
sets of the 11 parameters (a;),, for p = 1, ...., Nb. 
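
One virtual calibration step of this kind might look like the sketch below. It assumes Gaussian noise added along the second coordinate only, and it stops at the simulated points of interest; the subsequent identification of the 11 parameters through Eq. (5) is not reproduced. All names are hypothetical.

```python
import random

def fit_line(points):
    """Ordinary least-squares fit z = a*r + b; returns (a, b)."""
    n = len(points)
    mean_r = sum(r for r, _ in points) / n
    mean_z = sum(z for _, z in points) / n
    s_rr = sum((r - mean_r) ** 2 for r, _ in points)
    s_rz = sum((r - mean_r) * (z - mean_z) for r, z in points)
    a = s_rz / s_rr
    return a, mean_z - a * mean_r

def intersect(line1, line2):
    """Intersection of z = a1*r + b1 and z = a2*r + b2 (non-parallel lines)."""
    (a1, b1), (a2, b2) = line1, line2
    r = (b2 - b1) / (a1 - a2)
    return r, a1 * r + b1

def simulate_intersections(line1, line2, xs1, xs2, delta, nb, seed=0):
    """Return nb simulated points of interest for one acquisition.

    The points of each reference line are scattered with standard
    deviation `delta` (assumed Gaussian), refitted, and intersected.
    """
    rng = random.Random(seed)
    results = []
    for _ in range(nb):
        fitted = []
        for (a, b), xs in ((line1, xs1), (line2, xs2)):
            noisy = [(r, a * r + b + rng.gauss(0.0, delta)) for r in xs]
            fitted.append(fit_line(noisy))
        results.append(intersect(fitted[0], fitted[1]))
    return results
```

Repeating the call for each of the 18 acquisitions, with the noise level estimated for that acquisition, yields the simulated data sets from which the parameter sets would then be identified.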

The impact of the digitising noise on the location of a 3D digitised point is 
calculated as follows. A grid of points is generated in the 2D CCD frame 
(Figure 8a). For each set of parameters $(a_i)_p$, we perform the transformation of 
the 2D grid points into 3D points, using the global transfer function (5). This 
operation is repeated for $p$ = 1 to Nb. As a result, to each point of the 2D grid 
correspond Nb points, defining a zone. This zone corresponds to the dispersion 
of the acquisition of a 3D point, a dispersion that takes into account the effect of 
the digitising noise on the calibration procedure (Figure 8c). 





Figure 8. Simulation of the 3D space according to Eq. (7) — Dispersion of the 
location of a 3D point. 

The value of the dispersion can be quantified by the value of the equivalent 
disc area of the ellipse that fits the point set. The obtained 3D-uncertainty zone 
is a function of the acquisition location in the 2D space, which confirms the 
anisotropic behaviour of the CCD matrix. Results show an extension of the 
position uncertainty from 0.05 mm to 0.3 mm in peripheral zones of the laser 
plane seen by the CCD camera. 
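
A possible way of computing an equivalent disc from a scattered point set is sketched below. It is an assumption, not the authors' stated procedure: the ellipse is taken from the sample covariance of the points expressed in 2D coordinates within the laser plane, with semi-axes equal to `scale` times the principal standard deviations, so that its area is $\pi\,\mathrm{scale}^2\sqrt{\det C}$.

```python
import math

def ellipse_area(points, scale=2.0):
    """Area of the covariance ellipse of 2D points (scale = assumed factor)."""
    n = len(points)
    mx = sum(x for x, _ in points) / n
    my = sum(y for _, y in points) / n
    sxx = sum((x - mx) ** 2 for x, _ in points) / (n - 1)
    syy = sum((y - my) ** 2 for _, y in points) / (n - 1)
    sxy = sum((x - mx) * (y - my) for x, y in points) / (n - 1)
    det = sxx * syy - sxy ** 2          # product of the two eigenvalues
    return math.pi * scale ** 2 * math.sqrt(max(det, 0.0))

def equivalent_disc_radius(points, scale=2.0):
    """Radius of the disc having the same area as the fitted ellipse."""
    return math.sqrt(ellipse_area(points, scale) / math.pi)
```

Evaluating `equivalent_disc_radius` at every node of the 2D grid gives a map of the dispersion over the field of view.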


6. Conclusion 


In this paper we proposed a method for the calibration of a laser-plane sensor on 
CMMs through the definition of a global transfer function. The method allows 
the transformation of 2D data into 3D coordinates in the CMM system for all 
orientations of the sensor supported by the CMM. The transfer function involves 
11 parameters that are identified simultaneously by measurement of a 
specific artefact, the facet sphere. Little work has yet been done in this direction. 
However, as for any type of calibration method, the process is a source of 
dispersion. Indeed, due to the digitising noise, there is variability in the 
parameters of the transfer function. The simulation we propose defines virtual 
calibrations, and leads to an estimate of the parameter variability. It is thus 
possible to evaluate the impact on the uncertainty of points measured by an optical 
measuring system. This highlights that the uncertainty of such measuring systems 
is significant, and does not allow precise measurements for dimensional 
metrology. 


Statistical methodology, coming from the mathematical branch of probability theory and 
powerfully helped today by the alliance of computing science and other fields of 
mathematics, gives ways of "making the data talk". The place of statistical methods in 
metrology and testing is presented. Emphasis is put on some difficulties encountered 
when applying statistical techniques in metrology: the importance of the traditional 
mathematical assumptions of many statistical techniques, the difference between a 
confidence interval and other intervals (coverage interval, ...), and the difficulties caused by the 
differences in terminology and concepts used by metrologists and statisticians. Finally, 
information is given about standardization organizations involved in metrology and 
statistics. 


1. The place of Statistics 


A key role is played today by the production, collection, analysis, presentation 
and interpretation of data. But it is useless to collect data if it is not to be 
analyzed and interpreted with a view to enlighten human actions or to progress 
the knowledge of phenomena. In summary, the aim is to transform raw data, 
once collected, into usable, understandable and communicable information. 
Statistical methodology coming from the mathematical branch of probability 
theory, and powerfully helped today by the alliance of computing science and 
other fields of mathematics (linear algebra, graph theory, algorithmic...), gives 
ways of "making the data talk". 

Different perspectives may be used to see what today is the place of statistics. 
The first perspective "from facts to decision" is to place statistics in a classic 
process starting with facts and ending with actions and decisions. 

Such a process is described by the sequence: 

Events/facts -> Data -> Information -> Knowledge -> Goals -> Action plans -> 
Actions/decisions -> Events/facts -> ... 

Statistical techniques such as design of experiments are used in the "events/facts" 
subprocess. Data collection and data organization techniques are used in the 
"data" subprocess. Data analysis techniques are used in the "information" 
subprocess. Statistical decision theory techniques are used in the "action plans" 
and "actions/decisions" subprocesses. 

Another interesting perspective is to place statistics in the "information 
management cycle" inspired by Deming’s PDCA. 

The "plan" stage is about data planning. It is the domain of planning, where 
statistics are used for the data design, including available data and data to be 
produced, and for the data processing design. 

The "do" stage is about data production and management. It is the domain of 
data management where statistics are used for data collection, data checking, 
data organization and management. 

The "check" stage is about the production of information and knowledge. It is 
the domain of knowledge where statistics are used for data access, off line or on 
line data analysis and reporting. 

The "action" stage is about decision making. It is the domain of action and 
decision where statistics are used for decision rules and indicators. 

A third perspective is to use the life cycle of products (see [1]). The starting 
point and ending point of this cycle are respectively customer expectations and 
customer satisfaction. The megaphases of this cycle are conception, 
development, delivery, maturity and death. 

The concept of variation, and the need to interpret data, and filter attributable 
effects from "noise" is inherent in all phases of the product life cycle. Statistical 
methods are used in the collection and interpretation of data on customer needs. 
The use of the methods of experimental design is fundamental to the product 
and process development and design phase. Statistical methods for process 
analysis, process control, and process improvement are needed to monitor and 
improve the process performance. Reliability methods and statistical models for 
condition monitoring and analysis of failure data are of importance for 
monitoring product and process during the operational lifetime. 

From this short presentation it appears that statistics is not only a set of 
techniques but is a methodology for collecting data, analysing data, 
transforming data into information and designing decision rules, which takes 
place in the product life cycle and in the information management cycle. More 
information can be found in [2]. 


2. Issues in Metrology 


Traditionally, in metrology related activities, statistical methods have been 
utilised and referred to in the description of measurement and test methods for 
properties of products and in sampling inspection. The need for application of 
statistical methods is, however, not limited to these activities. 
Statistical methods are needed for assessing measurement uncertainty, for 
calibration, and for monitoring and improving the measurement processes at the 
producer's site; they are also needed by the various agencies involved in 
testing, verifying conformance, and validating the producer's quality and 
environmental management systems. 

However, to improve the role statistics may play for metrology, it is important 
to identify issues on which cooperation between experts of these domains has to 
be reinforced. 


2.1. Terminology and concepts 


Using consistent concepts is of major importance. Without pertinent concepts, 
the risk is that everyone agrees when someone says that the right angle boils at 90°. 

Two reference documents have to be used in the domains of metrology and 
statistics. These documents are ISO standards: the VIM [3] for metrology and 
ISO 3534 [4] for statistics. Of course these two systems of concepts do not 
cover the same domain; they interact on concepts related to applied 
statistics in metrology and testing activities. 

Unfortunately, in this field of interaction differences exist. As a consequence, 
practical difficulties for laboratories come from these discrepancies, either in 
the concepts themselves or in the wording of the concepts’ definitions. 

Two typical examples can be given. 

In the VIM the concept of error relates to the deviation between a datum 
(measurement result, indication, ...) and a “true” value. In ISO 3534, the concept 
of (statistical) error, also called residual, relates to the deviation between a datum 
and the (statistical) model. 

In the VIM the concept of repeatability relates to a situation where all conditions 
of measurement are the same. In ISO 3534, the concept of repeatability relates 
to the sources of variation which are not explicitly taken into account in the 
model and which therefore are included in the variance of the residual (the error 
term of the statistical model). 


2.2. Usual assumptions and robustness 


Another issue is the importance of the usual assumptions made when using a 
statistical model. In general, the two main assumptions are normality and 
independence. 

A statistical model has to be seen under two perspectives. The first one is the fit 
to the input data. This is a geometric perspective (how does the model fit the 
data?) in which no assumption on the data distribution (e.g. normality) is 
necessary, as long as no inference is made. 

The second perspective is related to the use of the model to infer conclusions about 
other data or situations. This is why assumptions have to be made on the data 
distribution. It is well known that the mathematical properties of the normal 
(Gaussian) distribution allowed the development of most of the models which are 
practically used. 

However, many "natural" phenomena are not "normal" (i.e. distributed 
according to the gaussian law). 

Working with low levels of pollutants requires taking asymmetry 
and truncation into account. Measuring the largest diameters of a set of rods requires working 
with extreme value distributions. Analysing failure occurrences requires the 
statistics of rare events. Reliability tests require working with censored data. 
Today, efficient solutions for these "non normal" situations are given by 
numerical simulation techniques. These techniques, such as Monte Carlo 
simulation or the bootstrap, allow the "true" distribution of 
observed data to be taken into account. In this respect, however, the quality of the data deserves careful 
thought (Garbage In, Garbage Out). 
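
As an illustration of such a simulation technique, the sketch below computes a percentile bootstrap interval for a statistic of a skewed sample without assuming normality. The percentile variant, the default statistic (the median) and the function name are choices made here for the example, not prescriptions from the text.

```python
import random
import statistics

def bootstrap_interval(data, stat=statistics.median, n_boot=2000,
                       level=0.95, seed=0):
    """Percentile bootstrap interval for `stat` evaluated on `data`."""
    rng = random.Random(seed)
    n = len(data)
    # Resample with replacement and recompute the statistic each time
    estimates = sorted(stat(rng.choices(data, k=n)) for _ in range(n_boot))
    lo_index = round((1.0 - level) / 2.0 * n_boot)
    hi_index = round((1.0 + level) / 2.0 * n_boot) - 1
    return estimates[lo_index], estimates[hi_index]
```

The interval is only as good as the sample it resamples: a biased or contaminated data set yields a biased interval, which is the "Garbage In, Garbage Out" caveat in practice.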

The second assumption used by statistical models is independence. Actually, 
non correlation is often a sufficient assumption as long as second order 
moments (variances) are considered. 
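
The difference between non correlation and independence can be checked on a small exact example, chosen here for illustration: X takes the values -1, 0, 1 with equal probability and Y = X². The two variables are uncorrelated, so the variance of their sum is the sum of their variances, yet they are clearly dependent.

```python
from fractions import Fraction

P = Fraction(1, 3)                       # each outcome is equally likely
outcomes = [(-1, 1), (0, 0), (1, 1)]     # pairs (x, y) with y = x**2

def expect(g):
    """Exact expectation of g(x, y) over the three outcomes."""
    return sum(P * g(x, y) for x, y in outcomes)

def variance(g):
    return expect(lambda x, y: g(x, y) ** 2) - expect(g) ** 2

cov_xy = expect(lambda x, y: x * y) - expect(lambda x, y: x) * expect(lambda x, y: y)
var_x = variance(lambda x, y: x)
var_y = variance(lambda x, y: y)
var_sum = variance(lambda x, y: x + y)

# Dependence: P(X=0, Y=0) differs from P(X=0) * P(Y=0)
p_joint = expect(lambda x, y: 1 if (x == 0 and y == 0) else 0)
p_product = (expect(lambda x, y: 1 if x == 0 else 0)
             * expect(lambda x, y: 1 if y == 0 else 0))
```

Here `cov_xy` is zero and `var_sum` equals `var_x + var_y`, while `p_joint` and `p_product` differ.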

Two kinds of independence have to be considered. The first kind is 
independence between the random variables, present on the right side of the 
equation, which represent the explanatory variables (the so-called “independent 
variables”) and the residual. The second kind is the independence between the 
random variables representing repeated measurement. 

The question of testing or checking for this assumption is still very difficult to 
address. One reason is that different concepts are mixed: stochastic 
independence for the statistical model, statistical correlation generally used as 
the tool for evaluating non correlation, and “substantive correlation” as evaluated 
from prior knowledge, not directly related to the data used, for evaluating the 
statistical model. This question may be considered as remaining open. 


2.3. p values 


The meanings that users attach to the null hypothesis of significance tests cover a very 
wide range. Users also have major difficulties in interpreting p-values. Apart 
from everyone’s experience, these have been reported in different publications 
(see, e.g. [5]). 

Actually, a major confusion exists between statistical significance, expressed in 
terms of p-values, and “substantive significance”, related to the expert knowledge 
of the studied phenomena. This should be improved by using more statistical 
thinking and fewer statistical rituals. 

A consequence of these confusions and of the difficulties in verifying the model 
assumptions is the poor confidence that can be put in the traditional values of 
the coverage factor k and the coverage probability p in measurement 
uncertainty. 

The traditional pairs of values for (k, p), namely (2, 0.95) and (3, 0.99), have to 
be considered only as conventional values (see, e.g. [6]). Unless the 
assumptions can be met or a careful numerical simulation is performed, 
the value of p, as a function of k, cannot be estimated with a good level of 
“confidence”. Moreover, it would be preferable to derive a pertinent value of p 
from a risk analysis of the intended use of the uncertainty of measurement, and 
to estimate a value of k from the value of p. 
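
The dependence of p on the assumed distribution can be made concrete with a few closed-form cases. The sketch below gives the probability that a zero-mean variable lies within k standard deviations for four symmetric distributions chosen here as examples, and the value of k that corresponds to a given p in the normal case only.

```python
import math
from statistics import NormalDist

def coverage_normal(k):
    return math.erf(k / math.sqrt(2.0))

def coverage_rectangular(k):
    # half-width a = sigma * sqrt(3)
    return min(1.0, k / math.sqrt(3.0))

def coverage_triangular(k):
    # symmetric triangular, half-width a = sigma * sqrt(6)
    t = min(1.0, k / math.sqrt(6.0))
    return 1.0 - (1.0 - t) ** 2

def coverage_laplace(k):
    # double exponential, scale b = sigma / sqrt(2)
    return 1.0 - math.exp(-k * math.sqrt(2.0))

def k_for_p_normal(p):
    """Coverage factor giving probability p under the normal assumption."""
    return NormalDist().inv_cdf((1.0 + p) / 2.0)
```

For k = 2 these functions give about 0.954 (normal), 1.0 (rectangular), 0.966 (triangular) and 0.941 (Laplace), so the pair (2, 0.95) is at best approximate once the normal assumption is dropped.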


2.4. Intervals 


Confusions and difficulties also occur with the different kinds of intervals that 
may be used. Apart from the difficulty of understanding the differences between 
the concepts, confusion also arises because their names are formed from a limited 
number of words such as: interval, confidence and coverage. 

A coverage interval is defined (GUM §2.3.5) as an “interval ... that may be 
expected to encompass a large fraction of the distribution of values that could 
reasonably be attributed to the measurand...Note 1: the fraction may be viewed 
as ... the level of confidence of the interval.” 

A confidence interval is defined (ISO/CD 3534-1.3 §3.28) as an “interval 
estimator ... for the parameter...with the statistics ...as interval limits...Note 1: 
The confidence reflects the true proportion of cases that the confidence interval 
would contain the true parameter value in a long series of repeated random 
samples (3.6) under identical conditions. A confidence interval does not reflect 
the probability (4.6) that the observed interval contains the true value of the 
parameter (it either does or does not contain it).” 

A statistical coverage interval is defined (ISO/CD 3534-1.3 §3.26) as an 
“interval determined from a random sample (3.6) in such a way that one may 
have a specified level of confidence that the interval covers at least a specified 
proportion of the sampled population (3.1) ... Note: The confidence in this 
context is the long-run proportion of intervals constructed in this manner that 
will include at least the specified proportion of the sampled population.” 

It is important to notice that these concepts are different and that the whole 
system is consistent. 

This means that a coverage interval is not a confidence interval. A coverage 
interval does not estimate a parameter and the limiting values are not statistics 
(when type B evaluation is used). 

A coverage interval is not a statistical coverage interval either. 

It also has to be noticed that in the definitions of confidence interval and 
statistical coverage interval, the level of confidence is properly defined in a 
“frequentist way”. 
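
The frequentist reading of the level of confidence can be illustrated by simulation. The sketch below repeatedly draws normal samples, builds the interval "mean ± z·σ/√n" and counts how often it contains the true mean. It assumes a known standard deviation so that only the standard library is needed; all parameter values are arbitrary.

```python
import random
from statistics import NormalDist, fmean

def long_run_coverage(true_mean=10.0, sigma=1.0, n=5, level=0.95,
                      repeats=4000, seed=1):
    """Fraction of simulated confidence intervals containing true_mean."""
    rng = random.Random(seed)
    z = NormalDist().inv_cdf((1.0 + level) / 2.0)
    half_width = z * sigma / n ** 0.5
    hits = 0
    for _ in range(repeats):
        sample_mean = fmean(rng.gauss(true_mean, sigma) for _ in range(n))
        # The interval is [sample_mean - half_width, sample_mean + half_width]
        if abs(sample_mean - true_mean) <= half_width:
            hits += 1
    return hits / repeats
```

The returned fraction approaches the stated level as the number of repetitions grows. It is a property of the procedure, not of any single observed interval, which either contains the true value or does not.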

This is not the case in the definition of the coverage interval. It is necessary to 
go to other parts of the GUM to see that the level of confidence is rather a 
degree of belief. 

Therefore, what can be the users’ meaning of "confidence"? Efforts should 
certainly be undertaken to clarify these concepts in the users’ minds. 


2.5. Use of measurement uncertainty 


Laboratories have devoted many resources in past years to estimating their 
uncertainties of measurement. These efforts have to be continued as 
measurement equipment and methods are evolving. However, today’s issues are 
also about the use of measurement uncertainty. 

Three main domains can be identified where methods describing how to use 
measurement uncertainty should be improved or developed. 

In conformity assessment (of a product) a decision is taken, from a sample 
(item), about the conformity of a population (lot). In the related international 
standards, two sources of variation are taken into account today: sampling and 
variability of production. Measurement uncertainty, as representing the 
uncertainty of the measurement process, is generally not explicitly identified as 
a source of variation. When it is identified, it is assumed to be negligible 
compared to the other sources of variation. 

However, situations exist where this is not the case, e.g. for chemical analysis. It 
is therefore necessary to provide methods that take into account all the relevant 
sources of variation. 

A second issue is the evaluation of the capability of equipment and processes. 
The capability characterises the degree to which equipment or processes respect 
specifications. This concept can also be applied to test and measurement 
methods. Multiple definitions of this concept, coming from different fields of 
activity, are today available. Measurement uncertainty should always be 
explicitly taken into account. 

A crucial question is the control of measurement uncertainty over time. In a 
typical evaluation of measurement uncertainty, time does not appear as a source 
of variation, as it is implicitly assumed that measurement uncertainty is not time 
dependent. When capability evaluation is considered it is necessary to consider 
the stability of the processes over time; therefore the dependency of 
measurement uncertainty over time has to be explicitly considered. 

The last issue is the evaluation of the proficiency of laboratories. Collaborative 
or interlaboratory studies have proved to be a powerful tool for this 
purpose. A recent ISO standard, ISO/DIS 13528 [7], introduces the use of 
measurement uncertainty in proficiency evaluation. However, measurement 
uncertainty is not the only parameter to take into account. In this respect it is 
necessary that all the information used concerning the quality of the information 
processed (laboratories’ results, reference values, ...) be established on a 
consistent basis. 


2.6. Metrology and Testing 


It is important to understand that metrology and testing are domains which are 
not identical even if they largely overlap. 
The purpose of a measurement is to evaluate a measurand; the purpose of a test 
is to evaluate a characteristic. 

A measurand (i.e. a particular quantity subject to measurement) is defined in 
consistency with the SI system of units and is independent of the measurement 
method, in the sense that a given quantity can be determined by using different 
methods of measurement. 

On the contrary, a characteristic does not necessarily refer to the SI system of 
units (e.g. hardness, viscosity, sweetness, ... and other words ending in "-ess") 
and is defined in the context of a given test method. The viscosities determined 
by different test methods are not the same viscosities (unless the “correlation” 
between test methods has been established). 

As a consequence the sub-concepts which are related to the measurand cannot 
be related to the characteristics but have to be related to the test methods. An 
example is given in the difference in the definitions of reproducibility in VIM 
and in ISO 3534. 

Moreover, a characteristic does not necessarily take its values in a continuous set 
(interval) of numerical values (real or complex, in dimension 1 or dimension n). 
Some characteristics take their values in a discrete, ordered set. 
These are called ordinal characteristics. Examples are mark scales or ranks. 
There are also characteristics for which the possible values are not numerical 
and do not have an order structure. These are called nominal characteristics. 
Examples are pass/fail test results, for which there are two possible values for the 
test result (pass or fail) or more than two values (if the cause of the failure is 
recorded). 

As a consequence pertinent statistical and data analysis techniques have to be 
used for these characteristics which do not pertain to traditional metrology. 


3. Resources: standardization committees 


The fundamental work of standardization organizations has to be pointed out. In 
a transversal domain such as metrology, efficient communication between 
experts of different fields (physics, chemistry, statistics, quality, ...) is a 
necessary condition for progress. Since international standards are documents on 
which a consensus has been reached, it is essential to improve the 
coordination between standardization organizations. 

The situation is certainly better today than it was in the past, as a first 
coordination is ensured by the participation of some experts in several of these 
committees. This is the case e.g. for ISO/TC 69 (the ISO technical committee 
developing standards on the application of statistical methods) and JCGM 
(under which the revision of the VIM and the drafting of 
supplemental guides to the GUM are undertaken). This participation improves the consistency 
between documents that are under development. 

However, this operational coordination is not sufficient. It is now necessary to 
establish coordination at the strategic level in order to improve the coordination 
of the business plans of these organizations. This should of course also include 
the activities of regional and national organizations. 

More information on the work of ISO/TC 69 and JCGM can be found on the 
following web sites: www.iso.ch and www.bipm.org. 


In this paper the possibility of expressing the final result of a (any) 
measurement by a probability distribution over the set, discrete or continuous, 
of the possible values of the measurand is considered. 

After a brief review of the motivation, state of the art and perspective of this 
option, the related software tools are discussed, with reference to the package 
UNCERT developed at the authors’ laboratory. 

The software implements an approach based on the direct calculation of 
probability distributions, for a wide class of measurement models. It is arranged 
in a hierarchical and modular structure, which greatly enhances validation. 
Involved variables are considered as inherently discrete and related 
quantisation and truncation effects are carefully studied and kept under control. 


1 Introduction 


The possibility of expressing the final result of a (any) measurement by a 
probability distribution over the set, discrete or continuous, of the possible values 
of the measurand is a very attractive option for metrology. The reasons for this 
are manifold, concerning both the measurement evaluation (or restitution) 
process and the communication of information, and may perhaps be summarised as 
follows: 

1. the expression of the final result of measurement by a probability 
distribution is a highly satisfactory format from a theoretical standpoint, at 
least as soon as probability is accepted, as it usually is, as an appropriate 
logic for dealing with uncertain events/statements [1]; 

2. it allows calculation of any statistics needed for application (expected value, 
standard deviation, intervals...[2-4]) including those related to weaker, for 
example ordinal, measurement scales, which are of increasing theoretical 
and practical interest [5]; 


3. the propagation of probability distributions may be a necessary approach to 
deal with those aspects of uncertainty evaluation “for which the GUM is 
insufficiently explicit and for which it is not clear whether its conventional 
(mainstream) interpretation is adequate” [6], so that this topic is presently 
under consideration by the Working Group 1 (WG1), Measurement 
Uncertainty, of the Joint Committee for Guides in Metrology (JCGM) [7]; 

4. expressing the results of measurement in terms of probability is essential 
when risk and cost assessment are involved, and, in general, when 
measurement is intended to support critical decisions: these topics 
have a high socio-economic impact and are strategic for the 
measurement community [8-10]. 

* Work partially funded under EU SofTools_MetroNet Contract N° G6RT-CT-2001-05061. 


Let us now briefly consider the problems associated with this option. They are 
of both theoretical and practical nature and include: 

• the “meaning” of such a description (this is an essentially theoretical 
and even epistemological problem, but has also practical 
consequences); 

• the “model” that may be assumed for supporting such a description; 

• how to assign required probability distributions (those involved in the 
model); 

• the supporting metrology software. 

We are here mainly concerned with the last one, but we are convinced that 
any progress in this aspect will encourage pursuing this approach and will 
contribute to a progress in the theoretical debate. Software in support of a 
probabilistic expression may follow two main approaches, namely using 
stochastic simulation techniques [6] or performing direct calculation of involved 
probability distributions [11]. Although the first approach is at present more 
studied, we consider here the second one, since we are convinced of its validity. 
This includes the possibility of using it for checking the results obtained through 
simulation, through a back-to-back comparison strategy. 

The article is organized as follows: the main theoretical background is 
first concisely presented, including an outline of the assumed reference 
probabilistic model and the derivation of two specific models of wide interest. 
Then the development and implementation of the package UNCERT is 
addressed, starting from the hierarchy of models considered, going through 
some essential implementation details and also considering quality-assurance aspects. 
Finally, a number of illustrative test cases are discussed, showing the 
potential of the package, and conclusions are drawn. 

GUM is an acronym for the “Guide to the expression of uncertainty in measurement” 
[12]; for the “mainstream GUM” method, see section 2.2 of this article. 


2 The probabilistic model 


2.1 The probabilistic reference model 


Modelling of measurement processes is a key topic of measurement theory 
[14], with several operational implications also [15-17]. Although we surely 
have no room here for even outlining it, we simply recall some general features 
of the model presented in Ref. [1], which we will use as our reference model in 
the following. 

Let us then consider a measurable quantity $x$, with values defined in a set $X$, 
and the class of measurement tasks consisting in the measurement of $x$, under 
the assumption that it has a constant value for an interval of time $T$, which contains 
the interval of time $T_m$ required for entirely performing the measurement, 
including possible repeated observations. 

We consider the overall measurement process as resulting from the 
concatenation of two sub-processes, named observation and restitution. This 
distinction is very important and completely general, since the former accounts 
for transformations in the measuring system giving rise to the observable output, 
the latter includes data processing which yields the final measurement result. 
The output of the observation process is in general a vector 
$y = [y_1, \ldots, y_N] \in Y$ of 
observations. This formalism may represent measurement processes based on 
repeated observations from the same measuring system as well as indirect 
measurements, “in which the value of the measurand is obtained by 
measurement of other quantities functionally related to the measurand” [12]. 

We also introduce the vector of influence parameters $\theta$, with values 
$\theta = [\theta_1, \ldots, \theta_K]$ in a K-fold space $\Theta$. Influence parameters are of two kinds: 


1. influence quantities, i.e. “quantities which are not the subject of the 
measurement, but which influence the value of the measurand or the 
indication of the measuring instrument” [12]; 

2. parameters concerning the involved probability distributions, typically 
dispersion parameters. 

Actually parameters of the second class may be thought of as regarding a 
higher level of modelling, as a kind of hyper-parameters, but since they may be 
mathematically treated in the same way as those in the first class, we prefer to 
avoid such a complication. 
We also assume a probability measure to be defined over $\Theta$, so that the 
probability distribution $p(\theta)$ is defined. Then the observation process may be 
characterised by a parametrical probabilistic mapping from the set of values of 
the measurand $X$ to the set of the observations $Y$, which is characterised by the 
parametrical conditional probability distribution: 


    p_\theta(y|x) \triangleq p(y|x, \theta)    (3) 


where the symbol ‘$\triangleq$’ means ‘equal by definition’. 
In the special case of repeated observations, with independent variations, 
we have: 


    p_\theta(y|x) = \prod_{i=1}^{N} p_\theta(y_i|x)    (4) 


Let us now consider the restitution process. 
In the general case it is characterised by the transformation: 


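    p(x|y) = \int_\Theta p_\theta(y|x) \left[ \int_X p_\theta(y|x)\,dx \right]^{-1} p(\theta)\,d\theta = \int_\Theta p_\theta(x|y)\,p(\theta)\,d\theta    (5) 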


This formula is the result of joint application of Bayes Theorem and of the 
principle of total probability (see Ref. [1] for details). 

It is worth noting that on the basis of this result, restitution may be 
interpreted as the probabilistic inversion of the mapping provided by the 
observation process, marginal with respect to the measurand. 

Now, restitution induces a probability measure on X and leads to the 
assignment of a probability distribution to x, namely p(x| y). As a result of the 
measurement process it is also possible to describe the measurand by a single 
value, the measurement value, given by: 

    \hat{x} = E_x(x|y)    (6) 
where $E_x$ is the expectation operator, referred to the (random) variable $x$. 
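
A discrete counterpart of the restitution step can be sketched as follows. The sketch is an illustration only and is not the UNCERT implementation: it assumes an additive Gaussian observation model whose standard deviation plays the role of the single influence parameter, a finite grid for the measurand, and a discrete distribution for the parameter. All names are hypothetical.

```python
import math

def gaussian(u, sigma):
    return math.exp(-0.5 * (u / sigma) ** 2) / (sigma * math.sqrt(2.0 * math.pi))

def restitution(y, x_grid, theta_pmf):
    """Return p(x|y) on x_grid, following the structure of Eq. (5).

    y         : list of repeated observations
    x_grid    : candidate values of the measurand
    theta_pmf : dict {theta: probability}, theta = assumed noise std. dev.
    """
    posterior = [0.0] * len(x_grid)
    for theta, p_theta in theta_pmf.items():
        # Independent repeated observations: product over the y_i, as in Eq. (4)
        likelihood = [math.prod(gaussian(yi - x, theta) for yi in y)
                      for x in x_grid]
        norm = sum(likelihood)      # discrete counterpart of the integral over X
        for j, value in enumerate(likelihood):
            posterior[j] += p_theta * value / norm
    return posterior

def measurement_value(x_grid, posterior):
    """Expected value of x given y, the discrete counterpart of Eq. (6)."""
    return sum(x * p for x, p in zip(x_grid, posterior))
```

For example, `restitution([9.9, 10.0, 10.1], grid, {0.1: 0.5, 0.2: 0.5})` on a grid centred on 10 returns a distribution that sums to one and whose expected value is close to 10.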

Finally, the overall measurement process may be viewed as a probabilistic 
mapping from the set of the values of the measurand $X$ to the same set of the 
measurement values $\hat{X} = X$. Such a mapping is characterised by the conditional 
probability density $p(\hat{x}|x)$, which may be implicitly expressed by [27]: 


    p(\hat{x}|x)\,d\hat{x} = \int_{D_{\hat{x}}} p(y|x)\,dy    (7a) 
where $D_{\hat{x}}$ is the region in the space of $y$ where: 
    D_{\hat{x}} = \{ y : \hat{x} < E_x(x|y) < \hat{x} + d\hat{x} \}    (7b) 


The essential formulas of the model are summarised in Table 1. 
The model may be further generalised to account for vector 
measurements, including dynamic measurements, and measurements based on 
parameter estimation [1]. Such generalisations, however, will not be considered in 
this paper. 

Table 1. A summary of the key formulas of the probabilistic model. 

The Reference Probabilistic Model of the Measurement Process 

Process / Sub-process        Probabilistic representation 

Observation (sub-process)    p_\theta(y|x) \triangleq p(y|x, \theta) 

Restitution (sub-process)    p(x|y) = \int_\Theta p_\theta(y|x) \left[ \int_X p_\theta(y|x)\,dx \right]^{-1} p(\theta)\,d\theta 

Measurement (process)        p(\hat{x}|x)\,d\hat{x} = \int_{D_{\hat{x}}} \left[ \int_\Theta p_\theta(y|x)\,p(\theta)\,d\theta \right] dy, 
                             D_{\hat{x}} = \{ y : \hat{x} < E_x(x|y) < \hat{x} + d\hat{x} \} 

where: $x \in X$: the measurand; $y \in Y$: N-dimensional observation vector; 
$\theta \in \Theta$: K-dimensional parameter vector; $\hat{x} \in \hat{X}$: the measurement value. 


In the following sections some important models will be derived, as special 
cases of the above general model. 


2.2 The mainstream-GUM model 


On the one hand, the GUM [13] presents general principles for the
evaluation of measurement uncertainty which are, as such, fairly independent
of the assumed model; on the other hand, it develops a complete evaluation
method, called “mainstream GUM” in Ref. [6]. We adopt this term and
call the model implied by that method the “mainstream GUM model”. We
now want to show its connection with our reference probabilistic model.

Let us then consider the case in which the observable output is given, for each value
of the measurand x, by the following random function:

$y = F(x) = f(x, v)$ (8)
where F is a random function, f is a deterministic function, and v is a vector of n
random variables, the first m mutually correlated, the remaining (n-m)
uncorrelated. These random variables in general give rise both to observable
(random) variations and to unobservable (systematic) deviations.

Let us assume f to be linear or locally linearisable so that, at least
approximately and apart from an inessential additive constant, we have:
$y = kx + a'v$ (9)
where a is a column vector of known sensitivity coefficients (and ' is the
transposition operator). Equation (9) may be trivially solved as:
$\chi = x - \hat{x} = -c\, a'v$ (10)
where $\hat{x}$ is the measurement value, $\chi$ is the correction to be (virtually) applied
to it to obtain the value of the measurand x, $c = k^{-1}$, and $\hat{x} = cy$ or, in the case of
repeated observations, $\hat{x} = c\bar{y}$.

Moreover, considering that $\bar{y}$ is a special case of y and assuming the
distributions of v to be defined up to a vector of dispersion parameters $\delta$, we
get:

$p(x \mid \bar{y}, \delta) = p_\chi(x - \hat{x} \mid \delta)$, (11)
which finally yields:
$p(x \mid \bar{y}) = \int p(x \mid \bar{y}, \delta)\, p(\delta)\, d\delta$ (12)


Equations (12) and (10) provide a probabilistic expression of the GUM 
Mainstream model. Additional details will be given in section 3.2, where 
numerical implementation is treated. We may notice that, in this model, the 
possibility of a deterministic inversion of the equation(s) defining the 
observation sub-process has been assumed, which is not the general case. 
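
As an illustration of the linear model of Equations (9) and (10), the sketch below computes the measurement value from repeated observations and the standard deviation of the correction when the input quantities have known standard deviations and, optionally, a known correlation matrix. It is only a minimal sketch: the function names and the numerical values are hypothetical and are not part of the UNCERT package.

```python
import math

def measurement_value(c, observations):
    """Measurement value c * mean(y) for (repeated) observations y."""
    return c * sum(observations) / len(observations)

def combined_standard_uncertainty(c, a, sigma, corr=None):
    """Standard deviation of c * sum_i a_i v_i.

    a     : sensitivity coefficients a_i
    sigma : standard deviations of the input quantities v_i
    corr  : correlation matrix of v (identity, i.e. uncorrelated, if None)
    """
    n = len(a)
    if corr is None:
        corr = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    variance = sum(
        a[i] * sigma[i] * corr[i][j] * a[j] * sigma[j]
        for i in range(n)
        for j in range(n)
    )
    return abs(c) * math.sqrt(variance)

# Hypothetical numbers, for illustration only.
x_hat = measurement_value(0.5, [10.0, 10.2, 9.8])
u_uncorrelated = combined_standard_uncertainty(0.5, [1.0, 2.0], [0.3, 0.2])
u_correlated = combined_standard_uncertainty(
    0.5, [1.0, 2.0], [0.3, 0.2], corr=[[1.0, 1.0], [1.0, 1.0]]
)
```

The sign of the correction does not affect its standard deviation, which is why only the absolute value of c enters the result.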


2.3 A stochastic observation scheme 


The most general model at present implemented in the package (model
UNCERT D, see section 3.3) is the following one:
$y_i = Q(kx + w_i + s + h)$
$s = a'v$ (13)
where:
Q(.) = quantisation operator, with quantisation interval q;
$w_i$ = stationary series of uncorrelated normal random variables, with
variance $\sigma^2$;
s = influence quantity resulting from the linear combination of random
variables, as in the previous model;
h = hysteresis term, characterised by $p(h) = \frac{1}{2}\delta(h - h_0) + \frac{1}{2}\delta(h + h_0)$,
where $\delta$ is here the Dirac-delta operator.

It is worth noting that, in this case, no deterministic inversion (solution) of 
the model is possible, due to the non-linear character of the quantisation 
operator, yet the probabilistic inversion is still possible. 
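
To make the observation scheme concrete, the following sketch simulates indications according to the model above: a normal random term, an influence quantity s, a hysteresis term taking the values +h0 or -h0 with equal probability, and a final quantisation. The function names, default parameter values and the rounding rule used for Q are assumptions made for illustration only.

```python
import random

def quantise(value, q):
    """Q(.): round to the nearest multiple of the quantisation interval q."""
    return q * round(value / q)

def simulate_observations(x, n_obs, k=1.0, sigma=0.02, s=0.0, h0=0.0, q=0.1, seed=0):
    """Simulate y_i = Q(k*x + w_i + s + h) for i = 1..n_obs."""
    rng = random.Random(seed)
    observations = []
    for _ in range(n_obs):
        w = rng.gauss(0.0, sigma)                 # normal random variation
        h = h0 if rng.random() < 0.5 else -h0     # two-valued hysteresis term
        observations.append(quantise(k * x + w + s + h, q))
    return observations

ys = simulate_observations(7.46, 10)
```

Because of the quantisation step, different values of x can produce identical indications, which is why the model cannot be inverted deterministically.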

The observation process is characterised by:

$P(y \mid x, \sigma, s) = \prod_i \frac{1}{2} \int_{-q/2}^{+q/2} \left[ p_w(y_i - kx - s - h_0 + \xi \mid \sigma) + p_w(y_i - kx - s + h_0 + \xi \mid \sigma) \right] d\xi$ (14)
Here the dependence on the vector of dispersion parameters $\delta$ is implicit in the
term s. Otherwise it may be made explicit as:
$P(y \mid x, \sigma, \delta) = \int P(y \mid x, \sigma, s)\, p(s \mid \delta)\, ds$ (15)
Restitution is then provided by:

$p(x, \sigma \mid y) \propto \int P(y \mid x, \sigma, \delta)\, p(\delta)\, d\delta$ (16)
Finally the required distribution may be obtained as the marginal distribution:
$p(x \mid y) = \int p(x, \sigma \mid y)\, d\sigma$ (17)
Moreover, the distribution of $\sigma$ may also be obtained:
$p(\sigma \mid y) = \int p(x, \sigma \mid y)\, dx$ (18)


Again, details on the implementation will be given later on, in section 3.3. 


3 The Package UNCERT 


3.1 General features of the package 


The package has been developed with the aim of providing the metrologist 
with a set of tools giving full support in the expression of measurement results 
as a probability distribution over a suitable set of possible values of the 
measurand. Since direct calculation of involved distributions is performed, it is 
essential to consider an appropriate set of models to be treated. So the package is 
arranged in a hierarchical structure, starting from a set of basic modules, which 
may be extensively reused in the subsequent procedures, and progressively 
moving towards structures of increasing complexity. This philosophy has 
allowed a highly modular design. In parallel with the development of the
programmes supporting such models, the mainstream-GUM method has also been
implemented. This has a value in itself, but in the general
economy of the package it is mainly intended to provide a back-to-back testing
utility. Table 2 presents the progressive development of the package. This
step-by-step approach has greatly favoured the validation process.

Another key implementation issue has been the choice of treating all 
involved values as discrete random variables. This choice has been motivated 
both on theoretical considerations and on implementation reasons, and related 
quantisation and truncation effects have been carefully studied and kept under 
control. Moreover, direct calculation of probability distributions essentially 
involves convolutions, in the case of linear relations, and weighted means 
among distributions, when non-linear, parametric dependency is to be accounted 
for. 
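
The second operation mentioned above, the weighted mean among distributions, can be sketched as follows for discrete variables: given the distribution of x for each value of a parameter, and the probability of each parameter value, the result is the probability-weighted average of the conditional distributions (the discrete counterpart of Equation (12)). The function name and the numbers are hypothetical.

```python
def weighted_mean_of_distributions(conditionals, weights):
    """Return P(x_i) = sum_j P(x_i | delta_j) P(delta_j).

    conditionals[j][i] : P(x_i | delta_j), all given on the same grid of x
    weights[j]         : P(delta_j); normalised here if they do not sum to one
    """
    total = sum(weights)
    n_points = len(conditionals[0])
    return [
        sum(w * cond[i] for cond, w in zip(conditionals, weights)) / total
        for i in range(n_points)
    ]

# Two three-point conditional distributions and their weights (illustrative values).
p_x = weighted_mean_of_distributions(
    [[0.5, 0.5, 0.0], [0.0, 0.5, 0.5]],
    [0.25, 0.75],
)
```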
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Table 2. Steps in the development of package UNCERT 
            UNCERT 0-2:                               UNCERT A-D:
            the GUM mainstream method                 calculation of probability distributions

Models      0  Infinite dof, uncorrelated variables   A  Infinite dof, uncorrelated variables
            1  Finite dof, uncorrelated variables     B  Finite dof, uncorrelated variables
            2  Finite dof, correlated variables       C  Finite dof, correlated variables
                                                      D  Stochastic observation scheme

In the following sections details will be presented on the implementation of
the programmes forming the package.

3.2 Levels A through C 


Levels A, B and C, progressively developed, allow the final result of a
measurement to be expressed by a probability distribution, under the same
hypotheses assumed in the GUM Mainstream model, presented in section 2.2.

In that model, it is possible to express the effect of all influence quantities
on the final measurement result as a linear function of the variations in the
quantities, i.e., neglecting an additive constant, as $e = c \sum_{i=1}^{n} a_i v_i$, or, equivalently,
as $\chi = -e$, with known coefficients c and $a_i$. For each of the input quantities,
the following information is assumed to be available:
the following information is assumed to be available: 

a. either an estimated standard deviation $\hat{\sigma}_i$, with $\nu_i$ degrees of freedom,
which may be finite or infinite;

b. or an estimate of the range of variability², say $[v_{i0} - \Delta v_i, v_{i0} + \Delta v_i]$, with
limits either certain or subject to a relative uncertainty, say $\alpha_i$, i.e. the
uncertainty of the semi-range divided by the semi-range $\Delta v_i$.
Moreover, it is possible to identify a subset of mutually correlated input
quantities, say, without loss of generality, the first m, $[v_1, v_2, \ldots, v_m]$, $0 \le m \le n$, for
which the corresponding correlation matrix R is known.
Under the assumed hypotheses, we have: $p(x \mid \hat{x}) = p_\chi(x - \hat{x})$, where
$p_\chi(x)$ may be expressed as the composition, through a convolution rule, of the
distribution accounting for the global influence of the subset of uncorrelated
input quantities (if not null), $p_{\chi,\mathrm{uncor}}(x)$, with the distribution accounting for
the effect of the complementary subset of mutually correlated input quantities (if not
null), $p_{\chi,\mathrm{cor}}(x)$. Consequently:

$p_\chi(x) = p_{\chi,\mathrm{uncor}}(x) * p_{\chi,\mathrm{cor}}(x) = \int p_{\chi,\mathrm{uncor}}(x - \xi)\, p_{\chi,\mathrm{cor}}(\xi)\, d\xi$ (19)

and we have to evaluate these two probability distributions separately.

² We do not consider here, for simplicity, the case of a triangular distribution, which may
however be treated in essentially the same way as case b.
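
For discrete variables on a common grid, the convolution of Equation (19) reduces to a double sum. The sketch below is a plain implementation of that sum; the function name is hypothetical and the two input distributions are illustrative.

```python
def convolve_discrete(p, q):
    """Distribution of the sum of two independent discrete variables.

    p[i] and q[j] are probabilities on grids with the same step; the result
    has len(p) + len(q) - 1 points, and its first point corresponds to the
    sum of the first grid points of p and q.
    """
    result = [0.0] * (len(p) + len(q) - 1)
    for i, p_i in enumerate(p):
        for j, q_j in enumerate(q):
            result[i + j] += p_i * q_j
    return result

# Two identical two-point distributions give a three-point triangular one.
p_sum = convolve_discrete([0.5, 0.5], [0.5, 0.5])
```

Repeated application of this operation combines any number of independent contributions, at a cost that grows with the product of the numbers of grid points.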

For evaluating $p_{\chi,\mathrm{uncor}}(x)$, it is necessary to assign a probability
distribution to each of the variables, which may be done according to the rules
summarised in Table 3.


Table 3. Calculation of standard deviations and probability distributions
for individual influence quantities

Type of    Dispersion                          Uncertainty on                  Resulting standard                                          Probability
quantity   parameter                           dispersion parameter            deviation                                                   distribution

Normal     Standard deviation $\hat{\sigma}_i$ Infinite dof                    $c|a_i|\hat{\sigma}_i$                                      Normal
Normal     Standard deviation $\hat{\sigma}_i$ Finite dof $\nu_i$              $c|a_i|\hat{\sigma}_i\sqrt{\nu_i/(\nu_i - 2)}$              t-Student
Uniform    Semi-range $\Delta v_i$             Relative uncertainty $\alpha_i$ …                                                           …
                                               on semi-range


For the calculation of $p_{\chi,\mathrm{cor}}(x)$ we consider a normal distribution resulting
from the sum of the m correlated variables, with correlation matrix equal to R, and standard
deviations to be calculated according to the rules in Table 3.





3.3 Level D 


This level corresponds to the implementation of the stochastic observation 
scheme, described in section 2.3. 
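The first implementation step requires calculation of $P(y \mid x, \sigma, s)$.

Assuming a normal distribution for $w_i$, defining the cumulative normal
distribution as $\Phi(u) = \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{u} e^{-\xi^2/2}\, d\xi$, and considering a quantisation of the
variables x, σ, s, with quantisation interval Δx, for each observed value $y_i$ we
have:

$P(y_i \mid x, \sigma, s) = \frac{1}{2}\left[ \Phi\left( \frac{y_i + \frac{q}{2} - kx - s - h_0}{\sigma} \right) - \Phi\left( \frac{y_i - \frac{q}{2} - kx - s - h_0}{\sigma} \right) \right] + \frac{1}{2}\left[ \Phi\left( \frac{y_i + \frac{q}{2} - kx - s + h_0}{\sigma} \right) - \Phi\left( \frac{y_i - \frac{q}{2} - kx - s + h_0}{\sigma} \right) \right]$ (20)

For the whole vector of observations y, we obtain:

$P(y \mid x, \sigma, s) = \prod_i P(y_i \mid x, \sigma, s)$ (21)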

















Figure 1. Flowchart of UNCERT D. [Blocks: uncorrelated quantities; calculation of $P(y_i \mid x, \sigma, s)$ for each observation; product; calculation of the marginal distributions $P(x \mid y)$, $P(\sigma \mid y)$, u, U.]









Now it is possible to calculate the joint distribution $P(x, \sigma, s \mid y)$:

$P(x, \sigma, s \mid y) = \frac{P(y \mid x, \sigma, s)\, P(x)\, P(\sigma)\, P(s)}{\sum_{x, \sigma, s} P(y \mid x, \sigma, s)\, P(x)\, P(\sigma)\, P(s)}$ (22)

where:

P(x) is assumed to be constant over the set of the possible values of x;
P(σ) is either supplied by the user or assumed to be constant over the
set of the possible values of σ;
P(s) is either supplied as output of the UNCERT C programme or
assumed to be constant over the set of the possible values of s.

Finally the following marginal distributions may be calculated:

$P(x \mid y) = \sum_{\sigma, s} P(x, \sigma, s \mid y)$ (23a)
$P(\sigma \mid y) = \sum_{x, s} P(x, \sigma, s \mid y)$ (23b)

The flowchart of the algorithm is presented in Fig. 1.
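
The chain of calculations described in this section (probability of each quantised observation, product over the observations, joint distribution and marginals) can be sketched as below. This is not the UNCERT D code: function names, grids and the observation values are assumptions chosen for illustration, and constant distributions are used for x, σ and s when none is supplied.

```python
import math

def Phi(u):
    """Cumulative standard normal distribution."""
    return 0.5 * (1.0 + math.erf(u / math.sqrt(2.0)))

def p_single(y, x, sigma, s, k=1.0, q=0.1, h0=0.0):
    """Probability of one quantised observation y given x, sigma and s."""
    total = 0.0
    for h in (+h0, -h0):  # the two equally probable hysteresis branches
        centre = k * x + s + h
        total += 0.5 * (Phi((y + q / 2 - centre) / sigma)
                        - Phi((y - q / 2 - centre) / sigma))
    return total

def restitution(ys, xs, sigmas, ss, p_sigma=None, p_s=None, **model):
    """Return the marginals P(x|y) and P(sigma|y) on the grids xs and sigmas."""
    p_sigma = p_sigma or [1.0] * len(sigmas)   # constant if not supplied
    p_s = p_s or [1.0] * len(ss)               # constant if not supplied
    marg_x = [0.0] * len(xs)
    marg_sigma = [0.0] * len(sigmas)
    for i, x in enumerate(xs):
        for j, sigma in enumerate(sigmas):
            for l, s in enumerate(ss):
                weight = p_sigma[j] * p_s[l]   # P(x) is constant and cancels
                for y in ys:
                    weight *= p_single(y, x, sigma, s, **model)
                marg_x[i] += weight
                marg_sigma[j] += weight
    norm = sum(marg_x)
    return [v / norm for v in marg_x], [v / norm for v in marg_sigma]

# Illustrative data: ten indications with resolution q = 0.1.
ys = [7.5, 7.5, 7.4, 7.5, 7.5, 7.4, 7.5, 7.5, 7.5, 7.4]
xs = [7.30 + 0.01 * i for i in range(41)]
sigmas = [0.01 * j for j in range(1, 11)]
p_x, p_sig = restitution(ys, xs, sigmas, ss=[0.0], q=0.1)
x_hat = sum(x * p for x, p in zip(xs, p_x))   # measurement value
```

The triple loop makes the cost proportional to the product of the grid sizes, which is one reason why quantisation and truncation of the grids need to be kept under control.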


3.4 Software quality assurance 


The quality assurance of software intended for use in metrology is a very
important issue and the subject of specific research effort [18-21]. It includes general
requirements on the quality of software [22-23] as well as more specific
metrological requirements [20-21]. Moreover, the package UNCERT has been
developed in a research environment, which adds a further important
aspect, since quality in research laboratories is an open problem, due to the
specific character of research activities [24-26]. Let us then briefly address the
following key points:

• the development environment;
• the implementation process;
• the validation process.

The development of the package has been carried out in a research context, in
parallel with the development of the related theory, mainly concerning the
probabilistic modelling of measurement processes, and with studies on the
evaluation of uncertainty in industrial environments, supported by dedicated
experimentation. In the first stage, the software was intended mainly to support
the theoretical studies through numerical examples. In a second stage, once the
theory had reached a first satisfactory systematisation, a methodical implementation
was undertaken, according to quality assurance principles and the
metrological requirements. The goal of this second phase is to propose the
package as a standard tool for the expression of measurement results in
probabilistic terms, by direct calculation of the related probability distributions. The
personnel involved in the development of the package have a strong scientific
and metrological background and good programming experience. The
development of the software has followed the internal quality
assurance procedure of the laboratory, which is undergoing ISO 9001
certification.

The implementation process has included the definition of the overall 
architecture of the package, based on the hierarchy of models detailed in section 
3.1, and the implementation of the individual modules, corresponding to the 
“levels” in Table 2. 
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The main steps in the implementation of the individual modules, according 


to the laboratory procedure for the development of proprietary software, have
been the following:

1. definition of the specification of the module;

2. preparation of the code;

3. planning of a set of tests (validation suite);

4. application of the tests and any necessary correction/improvement of the module;

5. approval of the module and its formal inclusion in the software databases of
the laboratory, in a dedicated area located on the laboratory's internal
network server; from this moment onwards, all successive modifications
must be approved and documented;

6. each approved module is accompanied by:

• the manual, which is at the same time a user's manual, containing all
necessary information for the correct use of the module, and a
programmer's manual, containing all the necessary details on the
implementation of the module;

• a detailed report on the validation campaign;

• the input and output data sets of all the validation tests performed; a part
of such tests is intended to be repeated each time a modification is
applied, as a part of the modification-approval procedure;

• occasionally, dedicated studies on special features of the module have
been performed and reported; for example a careful study of the effects
of quantisation and truncation has been produced.


Finally, the validation procedure is based on the following main principles:

1. since the programs work on discrete variables, distributions with a very
small number of points (for example three points) have been used to check
the functionality, by independent calculation of the expected results;

2. at the opposite extreme, stress testing has been applied, for example by
checking performance for a large number of points per distribution (order of
10°) and a large number of variables (10');

3. back-to-back comparison with the programmes implementing the GUM-
Mainstream method has been employed;

4. reference data have been used, when possible, taken typically from the
GUM or from other standards.


Further validation is expected to take place through the network
SofTools_MetroNet.
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4 Application Examples 


4.1 Calibration of an end-gauge 


Let us consider example H1 of the GUM, which concerns the measurement 
of standard reference blocks. Uncertainty evaluation is carried out after 
linearization of the measurement model, according to the GUM mainstream 
procedure, and Table 4 presents the input data. 


Table 4. Summary of the input data for the GUM H1 example.

Uncertainty on dispersion parameter: 18 dof; 26 dof; 10% range; 50% range.

























Figure 2. Test case 1: probability distributions of the input quantities (a)
and final distribution on a separate scale (b).





4.2 Repeated observations 


Let us consider an example concerning repeated observations from a
low-resolution indicating device (adapted from [29]), considering two cases:
A) the instrument is free from systematic deviations;
B) the instrument is assumed to have systematic deviations, bounded
within ±0.05 mm.
Numerical values are collected in Table 5.


Table 5. Numerical data for the case studies. All data are expressed in millimetres.

Test case   q     Δs     y
A           0.1   0      7.5  7.5  7.4  7.5  7.5  7.4  7.5  7.5  7.5  7.4
B           0.1   0.05   7.5  7.5  7.4  7.5  7.5  7.4  7.5  7.5  7.5  7.4

Since observations have a resolution limited to q, we will discretise all relevant
quantities with Δx = q/10.


The resulting discrete joint distribution $P(x_i, \sigma_j \mid y)$ is presented in Fig. 3.

Figure 3. The graph of $P(x_i, \sigma_j \mid y)$.


The final measurement results, i.e. the distributions to be assigned to x and to σ, are
shown in Figure 4.

Figure 4. (a) Final result of measurement in both cases A and B. (b) Graph of $P(\sigma_j \mid y)$.
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4.3 Measuring device with hysteresis 


Let us now consider a case where some hysteresis is present in the
measuring device. We consider two sets of (simulated) data: in the first set the
instrument is assumed to work on the $+h_0$ curve, in the second to commute
randomly between the two curves. Results are presented in Fig. 5. It is
interesting to note that the probabilistic restitution algorithm is able to
recognise the two situations automatically and give the correct results. In
particular, it emerges that the second case is more favourable, since the data are
unbiased, although there is a greater dispersion.

Figure 5. Results from a measuring device affected by hysteresis (panels (a) and (b)).


5 Conclusions 


In this paper the possibility of expressing the final result of a (any) 
measurement by a probability distribution over the set, discrete or continuous, of 
the possible values of the measurand has been considered. 

In particular, a reference probabilistic model has been presented showing 
how it includes, as a special case, the model assumed in the GUM Mainstream 
method and also allows treatment of more sophisticated models, based on a 
stochastic observation scheme, including non linearities such as quantisation and 
hysteresis. Then the related software tools have been addressed, with reference 
to the package UNCERT, which has been developed with the aim of providing 
the metrologist with a set of tools giving full support in the expression of 
measurement results as a probability distribution. The package is organised on 
the basis of a hierarchy of models, which allows a highly modular design, and is 
based on direct calculation of involved probability distributions, treating all
variables as discrete. It has been developed according to quality assurance criteria
and validated by wide testing and is intended to be proposed as a standard tool 
for this kind of calculations. The expression of measurement results in 
probabilistic terms may be particularly beneficial whenever risk assessment or 
cost evaluation is required. 
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Information on possible values of a quantity can be gained in various ways and 
be expressed by a probability density function (PDF) for this quantity. Its
expectation value is then taken as the best estimate of the value and its standard
deviation as the uncertainty associated with that value. A coverage interval can
also be computed from that PDF. Information given by a small number n of
values obtained from repeated measurements requires special treatment. The
Guide to the Expression of Uncertainty in Measurement recommends in this
case the t-distribution approach, which is justified if one knows that the PDF for
the measured quantity is a Gaussian. The bootstrap approach could be an
alternative. It does not require any information on the PDF and, based on the
plug-in principle, can be used to estimate the reliability of any estimator. This paper
studies the feasibility of the bootstrap approach for a small number of repeated
measurements. Emphasis is laid on methods for a systematic comparison of the
t-distribution and bootstrap approaches. To support this comparison, a fast
algorithm has been developed for computing the total bootstrap and total median.


1. Introduction 


The Guide to the Expression of Uncertainty in Measurement [1], briefly termed
the GUM, distinguishes for historical reasons between the so-called Type A and
Type B evaluation of the standard uncertainty. The former treats information
gained from repeated measurements and the latter any other information. Both
types of evaluation lead to a probability density function (PDF) for the
considered input quantity X. The expectation value x of that PDF is taken as the best
estimate for the value of X and its standard deviation as the uncertainty u(x)
associated with that estimate. The PDF for the output quantity Y can be computed if
one knows the PDFs for the input quantities and the model that relates them to
the output quantity. The best estimate y and the uncertainty u(y) are related to
the PDF for Y in the same manner as described for input quantities.

* Work partially funded under EU SofTools_MetroNet Contract N° G6RT-CT-2001-05061.

The functional form (shape) of the PDF for the output quantity Y needs to
be known for determining the so-called expanded uncertainty $U_p$. The latter is
used to indicate an interval about the best estimate y that, based on the
information given, contains a large fraction p of the distribution of values that could
reasonably be attributed to the output quantity Y.

The standard GUM procedure for evaluating uncertainty is valid for linear 
models. In this case, the PDF for the output quantity is normal (Gaussian) unless 
one encounters one or few dominating input quantities with non-normal PDFs. 
Frequently occurring examples are a rectangular PDF or contributions from 
quantities assessed by only a small number n of repeated measurements. 

The standard procedure of the GUM uses the sample mean as best estimate
x for the value of X and the experimental standard deviation of the mean as
uncertainty associated with x, and infers the expanded uncertainty from a
t-distribution with ν = n − 1 degrees of freedom. This approach is justified if the
PDF for the measured input quantity is a Gaussian.

The bootstrap approach may provide an alternative to the standard Type A
evaluation of uncertainty. The bootstrap is known to give reliable estimates of
estimators and their uncertainty for large sample sizes for any PDF and it has
been used in metrological applications for many years [2].


This paper examines the feasibility of the bootstrap approach for small
sample sizes. It briefly describes the t-distribution and bootstrap approaches and
provides computational tools for fast bootstrapping that allow the total bootstrap
and median to be enumerated. It systematically compares both approaches to
Type A evaluation by taking many samples of a given small size from PDFs
encountered in practice. Finally, it presents conclusions drawn from the
comparison.


2. Approaches for Type A evaluations 


2.1. The t-distribution approach 


The GUM [1] uses an appropriate t-distribution for the quantity X if the
information on possible values of X consists of a set of n values $\{x_1, \ldots, x_n\}$ obtained
from repeated measurements under repeatability conditions.
In this approach, the mean value x of the measured values is the best
estimate for the value of the quantity X and the experimental standard deviation of
the mean is the uncertainty u(x) associated with that mean from the set of n
values $\{x_1, \ldots, x_n\}$:

$x = \frac{1}{n} \sum_{i=1}^{n} x_i$ and $u^2(x) = \frac{1}{n(n-1)} \sum_{i=1}^{n} (x_i - x)^2$ (1)
Based on Bayesian probability theory [3], one obtains a t-distribution with
ν = n − 1 degrees of freedom for the quantity X, if one has the additional
information that the measured values $x_i$ are drawn from a Gaussian PDF. The
t-distribution with ν degrees of freedom and the transformation to the PDF for X
are given by:

$g_T(\tau) = \frac{\Gamma\left(\frac{\nu+1}{2}\right)}{\sqrt{\pi\nu}\, \Gamma\left(\frac{\nu}{2}\right)} \left( 1 + \frac{\tau^2}{\nu} \right)^{-\frac{\nu+1}{2}}$ and $\tau = \frac{\xi - x}{u(x)}$ (2)

where τ and ξ denote possible values. The PDF for X is analogously expressed
by $g_X(\xi)$ and also denoted by X°.

The t-distribution is symmetric and the boundaries of the coverage interval
for a coverage probability p can be derived from

$p = \int_{-t_{p,\nu}}^{t_{p,\nu}} g_T(\tau)\, d\tau$ and $U_p = [x - u(x)\, t_{p,\nu},\ x + u(x)\, t_{p,\nu}]$. (3)

We use p = 0.95 in this paper and set $x - u(x)\, t_{p,\nu} = t_{0.025}$ and $x + u(x)\, t_{p,\nu} = t_{0.975}$.


2.2. The Bootstrap approach 


The nomenclature used in this paper attempts to follow the GUM. This implies 
that the bootstrap nomenclature is to be adjusted in some details. 

In practice, the PDF that can reasonably be assumed to underlie the
dispersion of the measurements is not necessarily normal. The bootstrap approach
does not require any information on the PDF from which the measured values $x_i$
are drawn. It was designed to estimate the reliability of any estimator [4]. It is
justified by the plug-in principle that is visualised in Figure 1. This principle can
be adopted for any statistic, not only for the mean. For instance, with a simple
formula it is possible to compute the estimate of the variance of the median. The
principle states that the bootstrap replications of any statistic applied to
bootstrap samples mimic the statistic of the corresponding output PDF. A bootstrap
sample is generated by re-sampling with replacement from the measured sample
(data set) of n values $\{x_1, \ldots, x_n\}$. We denote the bootstrap sample by $\{\xi_1, \ldots, \xi_n\}_r$;
the subscript r runs from 1 to R. For each of the R bootstrap samples one
computes a bootstrap replication of the statistic of interest, i.e. in this paper the
mean value $x_r$. The bootstrap replications $x_r$ mimic the “PDF” for the mean.
We calculate the arguments $b_{0.025}$ and $b_{0.975}$ at which the cumulative “PDF”,
termed CDF, of the mean reaches the probability indicated by the indices. These
values give the endpoints of a probabilistically symmetric 95 % coverage
interval, but not in general the shortest.

° Latin letters denote quantities (upper case) and the best estimates of their value (lower
case). Lower case Greek letters indicate possible values.
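
The re-sampling procedure just described can be sketched with a Monte Carlo bootstrap of the mean. The function name, the number of replications R and the convention used to read the 2.5 % and 97.5 % points from the sorted replications are assumptions made for this illustration.

```python
import random

def bootstrap_mean_interval(values, R=10000, p=0.95, seed=0):
    """Probabilistically symmetric coverage interval for the mean.

    Each of the R bootstrap samples is drawn with replacement from `values`;
    the endpoints are read from the sorted bootstrap replications of the mean.
    """
    rng = random.Random(seed)
    n = len(values)
    replications = sorted(
        sum(rng.choice(values) for _ in range(n)) / n for _ in range(R)
    )
    lower = replications[int((1.0 - p) / 2.0 * R)]
    upper = replications[int((1.0 + p) / 2.0 * R) - 1]
    return lower, upper

# Illustrative sample of five values.
b_low, b_high = bootstrap_mean_interval([1.0, 2.0, 3.0, 4.0, 5.0])
```

No assumption on the shape of the underlying PDF enters the calculation; only the measured values themselves are used.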
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Figure 1. Visualisation of the plug-in principle [4] and nomenclature used in this paper. 
The bootstrap samples are usually numbered from 1 to B but from 1 to R in this work.

Figure 2 illustrates and compares the t-distribution and bootstrap approaches
to inferring the distribution of the sample mean for samples of 5 and 8 simulated
measured values. The t-distribution has been computed using Equation (3). The
bootstrap will be explained in detail below in Section 3.


i The notation “PDF” indicates that it is mimicked. A mimicked bootstrap PDF is not
continuous but a set of at most (2n-1)!/(n!(n-1)!) discrete increasingly ordered values.

Figure 2. Bootstrap and t-distribution approach for the sample mean of samples $\{x_1, \ldots, x_5\}$
(upper part) and $\{x_1, \ldots, x_8\}$ (lower part). The values $x_i$ (drawn from a Gaussian PDF) are
indicated by broken lines. The smooth curves are the cumulative t- and total bootstrap-
distributions (CDF) of the sample mean. For $t_{0.025}$, $t_{0.975}$, $b_{0.025}$ and $b_{0.975}$ see the main
text.


3. Computational tools for a systematic comparison 


To study the feasibility of the bootstrap approach, it is necessary to examine a
large number M of samples in order to obtain a representative data basis. In
order to demonstrate that the bootstrap works for any distribution it is necessary to
consider several PDFs that might occur in practice. In this paper we consider a
Gaussian (G), rectangular (R), triangular (T) and U-shaped PDF (U) and use the
generic term parent-PDF for these PDFs. Furthermore, we study sample sizes
ranging from 4 to 8 and compute for each sample the expanded uncertainty
based on the t-distribution and the bootstrap approach.

To determine the expanded uncertainty using the t-distribution approach we
compute $x_m$ and $u(x_m)$ for $\{x_1, \ldots, x_n\}_m$ from Equation (1); m runs from 1 to M and
indicates the sample (simulated set of measured data). Equation (3) then delivers
$t_{0.025}$ and $t_{0.975}$. The values of $t_{p,3}$ to $t_{p,7}$ are 3.182, 2.776, 2.571, 2.447 and 2.365.
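
For a single sample, the calculation via Equations (1) and (3) can be sketched as follows, using the tabulated factors quoted above for ν = 3 to 7. The function and variable names are hypothetical.

```python
import math

# Factors t_{p,nu} for p = 0.95 and nu = 3..7, as quoted in the text.
T_FACTOR = {3: 3.182, 4: 2.776, 5: 2.571, 6: 2.447, 7: 2.365}

def type_a_t_interval(values):
    """Return the mean, its standard uncertainty and the 95 % interval limits."""
    n = len(values)
    mean = sum(values) / n
    u = math.sqrt(sum((v - mean) ** 2 for v in values) / (n * (n - 1)))
    t = T_FACTOR[n - 1]                      # nu = n - 1 degrees of freedom
    return mean, u, (mean - t * u, mean + t * u)

# Illustrative sample of size n = 5.
mean, u, (t_low, t_high) = type_a_t_interval([1.0, 2.0, 3.0, 4.0, 5.0])
```

The lookup table restricts the sketch to sample sizes from 4 to 8, the range studied in this paper.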

The calculation of expanded uncertainty using the bootstrap approach
requires more effort. For each sample of measured values $\{x_1, \ldots, x_n\}_m$ one needs to
calculate R bootstrap samples $\{\xi_{i,m}\}_r$ and generate bootstrap replications of the
mean $x_{r,m}$. The bootstrap replications mimic the “PDF” of the mean from which
we calculate $b_{0.025}$ and $b_{0.975}$. $t_{0.025}$, $t_{0.975}$, $b_{0.025}$ and $b_{0.975}$ have the same physical
dimension as the quantity X.
Without loss of generality we use the parent-PDFs for X in standard form,
i.e. x = 0 and u(x) = 1. The objective of the study is then reduced to examining
whether the intervals $[t_{0.025}, t_{0.975}]_m$ and $[b_{0.025}, b_{0.975}]_m$ encompass the value zero.
We denote the number of samples for which $t_{0.025} > 0$ by $M_{t-}$ and define the ratio
of $M_{t-}$ to M as the probability $p_{t-}$ for violating the “t-criterion”; $p_{t+}$, $p_{b-}$ and $p_{b+}$
are defined analogously.

The number M of simulated sets of measured data and the number R of 
bootstrap replications that are needed to achieve representative results depend 
on the method of calculation and the sample size n. 
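
The criterion described above can be illustrated for the t-distribution approach by a small Monte Carlo study: samples are drawn from a parent PDF in standard form, and one counts how often the interval lies entirely above or entirely below zero. The function name, the number of samples M and the seed are arbitrary choices for this sketch, which is not the faster method used in the paper.

```python
import math
import random

T_FACTOR = {3: 3.182, 4: 2.776, 5: 2.571, 6: 2.447, 7: 2.365}

def t_criterion_violations(n, M=20000, draw=None, seed=1):
    """Fractions of samples whose t-interval lies entirely above / below zero.

    draw : function taking a random.Random instance and returning one value
           from a parent PDF with mean 0 and standard deviation 1
           (standard Gaussian if None).
    """
    rng = random.Random(seed)
    draw = draw or (lambda r: r.gauss(0.0, 1.0))
    t = T_FACTOR[n - 1]
    above = below = 0
    for _ in range(M):
        sample = [draw(rng) for _ in range(n)]
        mean = sum(sample) / n
        u = math.sqrt(sum((v - mean) ** 2 for v in sample) / (n * (n - 1)))
        if mean - t * u > 0.0:      # lower limit above zero
            above += 1
        elif mean + t * u < 0.0:    # upper limit below zero
            below += 1
    return above / M, below / M

# Gaussian parent PDF, n = 5.
p_above, p_below = t_criterion_violations(5)

# Rectangular parent PDF in standard form (half-width sqrt(3)).
half_width = math.sqrt(3.0)
p_above_r, p_below_r = t_criterion_violations(
    5, draw=lambda r: r.uniform(-half_width, half_width)
)
```

For a Gaussian parent PDF both fractions should be close to 0.025, up to the statistical scatter of the simulation.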


3.1. “Brute force” and total bootstrap 


Let us first consider a “brute-force” approach. In that case one uses Monte 
Carlo to generate M samples and again Monte Carlo to generate R bootstrap 
replications for each sample. The trial numbers M and R would be determined 
by the usual convergence criteria; however usually M2 10° and R> 10° [4]. 


This “brute-force” approach requires quite long computing times for the
calculations themselves and for the convergence study. For small sample sizes n one
achieves a considerable reduction of computing time by using the total
bootstrap instead of Monte Carlo. To some extent it is also possible to replace the
Monte Carlo simulation of the samples (sets of measured data) by a
considerably faster method that will be described briefly in Section 3.3.


The total number of bootstrap samples and replications R for a given
sample of size n is equal to $n^n$ (sampling with replacement!), e.g. for n=8 one
obtains R = 16 777 216. However, many of these bootstrap samples consist of the
same values. Consider for example the bootstrap sample $\{\xi_1 = x_1, \ldots, \xi_8 = x_8\}_r$;
there are 40 320 (i.e. n!) permutations that yield the same value for $x_r$. The idea
behind the total bootstrap is quite simple: one uses only one bootstrap sample
and weights its contribution according to the number of possible permutations.
In the example above one obtains as weight $n!/n^n$. The reduction of R is
considerable since $R_{total} = (2n-1)!/(n!(n-1)!)$ whence, for n=8, $R_{total}$ = 6 435. In our
study it is quite important to represent the tails of the “PDF” of the bootstrap
replications. These are exactly represented by the total bootstrap but, since the
tails have very small weights, a large R would be needed when using Monte
Carlo for the generation of bootstrap samples.
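
The counts quoted in this paragraph can be checked directly; the short sketch below uses only standard-library functions and hypothetical variable names.

```python
from math import comb, factorial

n = 8
all_bootstrap_samples = n ** n            # ordered samples, drawn with replacement
permutations_all_distinct = factorial(n)  # orderings of a sample with n distinct values
total_bootstrap_size = comb(2 * n - 1, n) # (2n-1)! / (n! (n-1)!)
weight_all_distinct = permutations_all_distinct / all_bootstrap_samples
```

For n = 8 the total bootstrap therefore needs several thousand terms instead of almost seventeen million.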


To enumerate the total bootstrap, we consider the values x; of the sample 
{X7,... Xn}m as n “peas” with n different colours, simply denoted by 1, 2,..,n, and 
assume an urn that contains an equal but indefinitely large numbers of peas of 
each colour. A bootstrap sample is then realized by drawing n peas from that 


128 


urn. We denote a possible set of coloured peas by $[l_1, l_2, \ldots, l_n]$
and we want to form all sequences of n digits drawn from {1, ..., n} that are
non-decreasing, e.g., when n = 3, the sequences are 111, 112, 113, 122, 123,
133, 222, 223, 233, 333. The process is one of counting, from 1...1 to n...n,
but omitting all numbers containing digits that decrease. We could also say
that we apply the multinomial theorem to count the permutations. Thus, with
n = 5, there are 5!/(1! 2! 0! 0! 2!) = 30 permutations of 12255. We use a
simple algorithm that carries out the drawing of the $R_{total}$ bootstrap
samples of interest. The algorithm for the enumeration of the total bootstrap
is demonstrated in Table 1 and consists of the following steps:
① Start with [1,1,...,1], i.e. in all positions are peas of colour 1.
② Increment by one the colour number in the rightmost position j_② that
   contains a colour number of less than n. Repeat that step until the colour
   n in position j_② is reached, and then proceed to step ③.
③ Find the rightmost position j_③ that contains a colour number of less
   than n, increment by one the colour number in position j_③ and set the
   colour number in position j_② equal to that in position j_③. If possible,
   proceed with step ②; otherwise proceed with step ④.
④ Stop if neither step ② nor step ③ is possible, i.e. when [n, n,...,n] is
   reached.


Table 1. Algorithm for the generation of the total bootstrap and the total
median. A sample of size n=4 is used. For lack of space, only a subset of the
$R_{total}$ entries, ordinally numbered by r, is shown. The number of
permutations $m_r$ divided by $n^n$ is the probability of occurrence. Column C
displays the step.


Consider as an example for step ② the step from r=1 to r=2, i.e. from
[1,1,1,1]_1 to [1,1,1,2]_2 in Table 1: j_② = n, and the colour number in
position j_② is raised from 1 to 2. Consider as an example for step ③ the
step from r=4 to r=5, i.e. from [1,1,1,4]_4 to [1,1,2,2]_5 in Table 1: j_③
takes the value 3 in step ③; therefore, the colour number in position j_③ is
raised from 1 to 2 and the colour number in position j_② is set to 2.
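
The enumeration steps above can be illustrated by a short Python sketch. It
is not the authors' implementation: the function names `next_sequence` and
`total_bootstrap` are hypothetical, and the sketch assumes that, after a
colour number has been raised, every position to its right is reset to the
same colour, which is what keeps the sequence non-decreasing.

```python
from collections import Counter
from math import factorial


def next_sequence(seq, n):
    """Return the successor of a non-decreasing colour sequence, or None
    once [n, n, ..., n] has been reached."""
    seq = list(seq)
    # Search from the right for a position whose colour number is below n.
    for pos in range(len(seq) - 1, -1, -1):
        if seq[pos] < n:
            seq[pos] += 1
            # Assumption: all positions to the right take the new colour.
            for right in range(pos + 1, len(seq)):
                seq[right] = seq[pos]
            return seq
    return None


def total_bootstrap(n):
    """List (sequence, weight) pairs; weight = permutations / n**n."""
    result = []
    seq = [1] * n
    while seq is not None:
        # Multinomial count of the distinct permutations of the sequence.
        perms = factorial(n)
        for count in Counter(seq).values():
            perms //= factorial(count)
        result.append((tuple(seq), perms / n**n))
        seq = next_sequence(seq, n)
    return result
```

For n = 4 the fourth and fifth entries are (1,1,1,4) and (1,1,2,2), which is
the transition used as the example for step ③.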

Figure 3. Comparison of total and “brute force” bootstrap. The left figures
pertain to n=5 and the right ones to n=8. The figures show the “PDF” of the
sample mean mimicked by the total bootstrap (bottom) and (top) the ratio of
the “PDFs” mimicked by the total bootstrap to that mimicked by the
“brute-force” bootstrap. The corresponding cumulative distributions were
shown in Figure 2. The bootstrap always gives discrete values (see footnote
d), but the resolution in graphical representations is too coarse to show
that.


Figure 3 compares the total bootstrap for samples with n=5 and n=8;
$R_{total}$ is 126 and 6435, respectively. For the normal bootstrap we used
R = $R_{total}$. The figure shows clearly that the total bootstrap gives a far
better representation of
the tails of the PDF for the mean value. Using Monte Carlo we computed for 
many samples the probabilities for violating the t-criterion and the b-criterion 
using the total and the normal bootstrap and produced frequency distributions of 
the corresponding ratios. The results showed that the total bootstrap is much 
faster and more reliable for the studied sample sizes. 


3.2. Total bootstrap and the median 


Following an idea of Cox and Pardo [5], we also calculate the weights of an
increasingly ordered sample for the determination of the median and the
uncertainty associated with it. We reproduce their result for a sample size
n=7 and extend the calculation for sample sizes up to 17. The median of an
ordered odd sample is the ((n+1)/2)th element. The total median can be
represented in this case by a simple vector of weights $P_{n,odd}$ with the
elements $p_{i,n}$. The total median of an ordered odd sample is symmetric,
i.e. $p_{i,n} = p_{n+1-i,n}$. The results for n=3 are $p_{1,3}$ = 0.25926 and
$p_{2,3}$ = 0.48148; the results for odd sample sizes from n=5 up to n=17 are
shown in Table 2.
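
As a check on the quoted weights, the following Python sketch computes the
total-median weights for an odd sample size. It is an illustration under an
assumption that is not spelled out in the text: the median of a bootstrap
sample is at most $x_i$ exactly when at least (n+1)/2 of the n draws fall on
one of the i smallest values, so the cumulative weights follow from a
binomial sum. The function names are hypothetical.

```python
from math import comb


def total_median_weights_odd(n):
    """Weights p_{i,n}, i = 1..n, of the total median for odd n."""
    if n % 2 == 0:
        raise ValueError("n must be odd")
    m = (n + 1) // 2  # rank of the median in an ordered sample

    def cdf(i):
        # Probability that at least m of n draws are among the i smallest.
        q = i / n
        return sum(comb(n, k) * q**k * (1 - q) ** (n - k)
                   for k in range(m, n + 1))

    return [cdf(i) - cdf(i - 1) for i in range(1, n + 1)]


def total_median_odd(sample):
    """Total median and associated standard uncertainty, cf. Equation (4)."""
    xs = sorted(sample)
    weights = total_median_weights_odd(len(xs))
    median = sum(p * x for p, x in zip(weights, xs))
    variance = sum(p * (x - median) ** 2 for p, x in zip(weights, xs))
    return median, variance**0.5
```

For n = 3 the sketch gives 7/27 and 13/27, i.e. the values 0.25926 and 0.48148
quoted above.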

Table 2. Weights of the total median for odd sample sizes. Since the total
median is in this case symmetric, only the first (n+1)/2 elements are
displayed.

Using the data provided in Table 2, one can compute the total median and
the uncertainty associated with it for odd sample sizes from Equation (4):

\[
x_{\mathrm{median,odd}} = \sum_{i=1}^{n} p_{i,n}\, x_i
\quad\text{and}\quad
u^2(x_{\mathrm{median,odd}}) = \sum_{i=1}^{n} p_{i,n}\,
(x_i - x_{\mathrm{median,odd}})^2 . \tag{4}
\]


The median of an ordered even sample is the mean of the central pair, i.e. of
the (n/2)th and (n/2+1)th element [5]. Therefore, to represent the results for
even sample sizes one needs a matrix $P_{n,even}$ with the elements
$p_{i,j,n}$, where i refers to the (n/2)th and j to the (n/2+1)th element, both
taken from the increasingly ordered sample $\{x_1, \ldots, x_n\}$. Obviously,
the elements $p_{i,j,n}$ with j < i do not exist since the samples are ordered.
The matrix is therefore an upper triangular matrix. It is symmetric about the
anti-diagonal elements $p_{i,n+1-i,n}$, i.e., $p_{i,j,n} = p_{n+1-j,n+1-i,n}$.

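The value of the median and the uncertainty associated with it can be
computed from Equation (5a) and Equation (5b):

\[
x_{\mathrm{median,even}} = \sum_{i=1}^{n} \sum_{j=i}^{n} p_{i,j,n}\,
\frac{x_i + x_j}{2} , \tag{5a}
\]

\[
u^2(x_{\mathrm{median,even}}) = \sum_{i=1}^{n} \sum_{j=i}^{n} p_{i,j,n}
\left( \frac{x_i + x_j}{2} - x_{\mathrm{median,even}} \right)^2 . \tag{5b}
\]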










Equation (5a) can be re-ordered to

\[
x_{\mathrm{median,even}} = \sum_{i=1}^{n} \frac{1}{2}
\left[ \sum_{j=i}^{n} p_{i,j,n} + \sum_{k=1}^{i} p_{k,i,n} \right] x_i .
\tag{5c}
\]

The term in brackets in Equation (5c) is the sum over the ith row and the ith
column of matrix $P_{n,even}$. We denote half of this sum by $p'_{i,n}$ and
replace Equation (5a) by Equation (6):

\[
x_{\mathrm{median,even}} = \sum_{i=1}^{n} p'_{i,n}\, x_i . \tag{6}
\]

Table 3a shows the weights $p'_{i,n}$ for sample sizes from n=4 to n=16 and
Table 3b shows the elements of the median matrix $P_{8,even}$. Unfortunately,
a corresponding rearrangement of Equation (5b) is not possible.
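
For small even n the matrix and the row-and-column weights can be obtained by
brute-force enumeration, as in the following Python sketch. It is only an
illustration (the function names are hypothetical and indices are 0-based):
every non-decreasing bootstrap sample is visited once, weighted by its
multinomial probability, and assigned to the matrix element that belongs to
its central pair.

```python
from collections import Counter
from itertools import combinations_with_replacement
from math import factorial


def total_median_matrix_even(n):
    """P[i][j]: probability that the central pair of a bootstrap sample
    consists of the i-th and j-th ordered sample values (0-based)."""
    if n % 2:
        raise ValueError("n must be even")
    P = [[0.0] * n for _ in range(n)]
    # combinations_with_replacement yields non-decreasing index tuples.
    for seq in combinations_with_replacement(range(n), n):
        perms = factorial(n)
        for count in Counter(seq).values():
            perms //= factorial(count)
        i, j = seq[n // 2 - 1], seq[n // 2]
        P[i][j] += perms / n**n
    return P


def row_column_weights(P):
    """Half the sum over the i-th row and i-th column, cf. Equation (5c)."""
    n = len(P)
    return [0.5 * (sum(P[i][j] for j in range(i, n))
                   + sum(P[k][i] for k in range(i + 1)))
            for i in range(n)]
```

The diagonal element enters both sums, which is consistent with a central
pair of two equal values contributing its full weight to that value.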


Table 3a. Weights of the total median for sample sizes from n=4 to n=16 for
use with Equation (6). The weights not shown can be inferred from the
symmetry.
|o oasas [0.15712 | 0.09594









6 
i7 | 0.20118 | 0.15205 





Table 3b. The total median matrix P for a sample of size n=8. 

Figure 4 shows for the samples already used in Figures 1-3 the CDFs for the
total median using the data in Tables 3a and 3b and compares them with CDFs
for the mean mimicked by the replications from the total bootstrap. As
expected from theory, the median distribution is “broader” and reflects some
“memory” of the sample values.

Figure 4. Comparison of the CDF for the total bootstrap replications of the
mean (histogram) and the total median (step-wise linear) for a sample of size
n=5 (top) and n=8 (bottom). The values $x_i$ (same as in Figures 2 and 3) are
indicated by broken lines. The thin solid lines indicate $b_{0.025}$ and
$b_{0.975}$. The thick lines indicate the arguments where the cumulative
median distribution reaches the values 0.025 and 0.975, respectively.


3.3. The bootstrap approach and discretely represented distributions 


The benefit of using the total bootstrap algorithm has been a considerable
reduction in the computing time. We found a very simple way to extend the
ideas used for enumerating the total bootstrap. We consider $n_{arg}$
positions that we interpret as measured values. Our urn now contains an equal
but indefinitely large number of peas of each colour for each of the
$n_{arg}$ positions (colours). The algorithm produces for each draw a set of n
possible position numbers ranging from 1 to $n_{arg}$. Taking n=4 as an
example, the first such set is [1,1,1,1], indicating that the simulated sample
consists of 4 values pertaining to position 1, and the last such set is
[$n_{arg}$, $n_{arg}$, $n_{arg}$, $n_{arg}$], indicating that the simulated
sample consists of 4 values pertaining to position $n_{arg}$. The resulting
number of samples is $M_{total} = (n_{arg}+n-1)!/(n!(n_{arg}-1)!)$. The
positions are then interpreted as mid values of intervals such that the
integral of a chosen PDF over each such interval takes the value $1/n_{arg}$.
The advantage is that the set of positions needs to be computed only once. The
values, i.e. the mid values, assigned to the positions need to be calculated
only once for any PDF, too. The method works quite well if $n_{arg}$ = 81. But
the computing times increase beyond those needed when using Monte Carlo. We
used this approach successfully to check Monte Carlo results for the PDFs for
sample minimum and maximum.
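
A minimal Python sketch of the discretisation may help to fix ideas. Two
points are assumptions of this illustration, not statements of the text: the
“mid value” of each equal-probability interval is taken as the quantile at the
midpoint of its probability range, and the parent PDF is a standard Gaussian.
The names `position_values` and `number_of_samples` are hypothetical.

```python
from math import comb
from statistics import NormalDist


def position_values(n_arg, dist=NormalDist()):
    """Values assigned to the n_arg positions: one quantile per interval of
    probability 1/n_arg (assumption: probability midpoint of the interval)."""
    return [dist.inv_cdf((k - 0.5) / n_arg) for k in range(1, n_arg + 1)]


def number_of_samples(n_arg, n):
    """Number of non-decreasing position sets of size n, i.e.
    (n_arg + n - 1)! / (n! (n_arg - 1)!)."""
    return comb(n_arg + n - 1, n)
```

With `number_of_samples(81, 4)` one obtains 1 929 501 sets for n=4, and the
count grows quickly with n, in line with the remark on computing times.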


4. Systematic comparison of t-distribution and bootstrap approach


We used Monte Carlo for simulating the samples of repeated measurements and
the total bootstrap for mimicking the “PDF” for the sample mean and computed
the probabilities for violating the t-criterion and the b-criterion, i.e.
$p_{t-}$, $p_{t+}$, $p_{b-}$ and $p_{b+}$, for sample sizes ranging from n=4
to n=8. The results are shown in Tables 4 and 5. By virtue of the symmetry of
the problem one expects that $p_{t-} = p_{t+}$ and $p_{b-} = p_{b+}$. This was
confirmed by the calculation. The uncertainty associated with the values
shown, e.g. $u((p_{t-}+p_{t+})/2)$, is at most 0.01. The uncertainties have
been determined by splitting each Monte Carlo run into 100 sub-runs.

From theory one expects that the t-criterion is violated in only 2.5 % of all
samples if the parent PDF is a Gaussian. This was confirmed by the
calculations; see Table 4 (upper part). The table also shows that this is not
true for other parent PDFs and that a trend is evident from the parent PDF G
to U. The deviation decreases as expected with increasing sample size.
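
The Gaussian case can be reproduced with a small Monte Carlo sketch in
Python. The sketch rests on assumptions that should be read as such: the
t-criterion is taken to mean that the true mean (here 0) lies inside the
interval given by the sample mean plus or minus $t_{0.975}$ times the
standard uncertainty of the mean, the parent PDFs have unit variance, and the
tabulated quantiles 2.776 (ν=4) and 2.365 (ν=7) are used. All names are
hypothetical.

```python
import random

# Tabulated 97.5 % quantiles of Student's t for nu = n - 1 degrees of freedom.
T_975 = {5: 2.776, 8: 2.365}


def gaussian(rng):
    return rng.gauss(0.0, 1.0)


def rectangular(rng):
    # Rectangular PDF with zero mean and unit variance.
    return rng.uniform(-(3**0.5), 3**0.5)


def violation_rates(draw, n=5, runs=20000, seed=1):
    """Fractions of samples whose t-interval lies entirely below or
    entirely above the true mean 0."""
    rng = random.Random(seed)
    t = T_975[n]
    low = high = 0
    for _ in range(runs):
        xs = [draw(rng) for _ in range(n)]
        mean = sum(xs) / n
        s = (sum((x - mean) ** 2 for x in xs) / (n - 1)) ** 0.5
        half_width = t * s / n**0.5
        if mean + half_width < 0.0:
            low += 1
        elif mean - half_width > 0.0:
            high += 1
    return low / runs, high / runs
```

Calling `violation_rates(rectangular)` instead of `violation_rates(gaussian)`
gives a quick impression of the dependence on the parent PDF.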

It is known from theory that the b-criterion should be violated in only 2.5 %
of all samples for any parent PDF if the sample size is large. The results
presented here for small sample sizes show strong deviations and exhibit a
trend that is roughly the inverse of the one seen with the t-criterion; see
Table 4 (bottom part). Some explanation for these observed trends can be seen
in Figure 5, which shows the distributions of $t_{0.975}$ and $b_{0.975}$ for
the parent PDFs G and U, and from Table 6.


Table 4. Computed probability for violating the t-criterion (upper part) and
the b-criterion (lower part) for various parent PDFs and sample sizes. The
numbers presented are $(p_{t-}+p_{t+})/2$ or $(p_{b-}+p_{b+})/2$ multiplied by
100 %. The first column indicates the parent PDF and the other columns pertain
to sample sizes indicated in the first row.

Figure 5 shows clearly that the distributions for $t_{0.975}$ and $b_{0.975}$
pertaining to the parent PDF G are similar to a Gaussian. The distributions
pertaining to the parent PDF U are asymmetric. This asymmetry decreases with
increasing sample size.

Figure 5. Distribution of $t_{0.975}$ (left figures) and $b_{0.975}$ (right
figures) for sample sizes n=5 (top) and n=8 (bottom) for the parent PDFs G
(shaded) and U.


Columns 6 and 7 in Table 5 show that the distributions for $t_{0.975}$ have a
larger variance than those for $b_{0.975}$. The difference in variance
decreases with increasing sample size, too. Columns 2 and 3 in Table 5 show
that the expanded uncertainty as calculated with the bootstrap is always
smaller than the “width” of the sample, i.e. $x_n - x_1$. Columns 4 and 5 in
Table 5 show that the t-distribution approach uses information, i.e. that the
parent PDF is G, that cannot be inferred from the measured samples.

Finally, Figure 6 shows the distribution of the maximal value $x_{max}$ in a
given sample. The distribution is termed S and is shown for the parent PDFs G
and R. Here R and not U is shown, since the curve for U would dwarf that for
G. The properties of R discussed below are in general more pronounced for U.
The distribution of the minimal value $x_{min}$ is obtained by mirroring about
zero. The shapes of S using G as parent PDF are not Gaussian but exhibit a
long tail towards high values. The shapes of S using R as parent PDF are even
less Gaussian and exhibit a cut-off at $3^{1/2}$. From this one can infer that
the mean width of the intervals [$x_{min}$, $x_{max}$] is larger for G than
for R and even more so for U. On the other hand, the sample variance is the
same for all parent PDFs. This explains the values for
$(t_+ - t_-)/(x_n - x_1)$ and $(b_+ - b_-)/(x_n - x_1)$ in Table 5.

Table 5. Computed expectation values of ratios of the interval spanned by a
simulated sample, [$x_n - x_1$], and the expanded uncertainties computed using
the t-distribution and the bootstrap approach, i.e. [$t_{0.975} - t_{0.025}$]
and [$b_{0.975} - b_{0.025}$]. For more details see the main text.


$(b_{0.975}-b_{0.025})/(x_n-x_1)$ | $(t_{0.975}-t_{0.025})/(x_n-x_1)$ | $(t_{0.975}-t_{0.025})/(b_{0.975}-b_{0.025})$

Figure 6. Distribution of $x_{max}$, i.e. the largest value of a given sample,
using the parent PDFs G (shaded) and R as computed using the total bootstrap
on M samples of measured values simulated by Monte Carlo. The sample sizes are
n=5 (left) and n=8 (right). The largest possible value for R is $3^{1/2}$. In
case of U or T one obtains $2^{1/2}$ or $6^{1/2}$, respectively.


5. Summary and Conclusions 


In practical metrology, the number of repeated measurements is often small due 
to the experimental conditions or the high costs of measurements. Generally, 
sample sizes of n< 30 can be considered to be too small to provide a reliable 
determination of uncertainties. 


The performance of the bootstrap and t-distribution approaches for the
determination of the expanded uncertainty has been compared. A fast algorithm
for enumerating the total bootstrap and the total median and tables for the
latter for sample sizes from n=4 to n=17 have been presented.

The fast algorithm allowed a systematic comparison of the two approaches for
the determination of the expanded uncertainty to be performed within
reasonable computing times. Monte Carlo was only used to simulate measured
data, here termed samples, whereas the bootstrap samples were computed using
tables for the total bootstrap. The sample sizes studied range from n=4 to
n=8, i.e. the degrees of freedom from ν=3 to ν=7. The dependence of the
expanded uncertainty calculations on the probability distribution function
(PDF) for the measured quantity was also examined by considering four parent
PDFs, a Gaussian (G), a rectangular (R), a triangular (T) and a U-shaped PDF
(U), that are of relevance in practice.

To conclude, the t-distribution approach produces reliable expanded
uncertainties if the parent PDF is a Gaussian; however, the information on the
parent PDF cannot be inferred from the measured data. For other parent PDFs it
produces intervals that contain the mean value with a probability of less than
95 %. The difference depends strongly on the parent PDF and decreases with
increasing sample size.

The bootstrap approach produces intervals that contain the mean value with a
probability of less than 95 %. The difference decreases with increasing sample
size and is larger for G and smaller for U.

The interval [$t_{0.025}$, $t_{0.975}$] is much wider than [$b_{0.025}$,
$b_{0.975}$], especially for small sample sizes, and is therefore overly
conservative.

In this paper only symmetric parent PDFs have been studied. It is an open 
question how well the bootstrap works for small sample sizes when the parent 
PDF is of skewed type that also occurs in metrological applications. 


We consider the problem of approximating a continuous real function known on
a set of points, which are situated on a family of (straight) lines or curves on 
a plane domain. The most interesting case occurs when the lines or curves are 
parallel. More generally, it is admitted that some points (possibly all) are not 
collocated exactly on the lines or curves but close to them, or that the lines or 
curves are not parallel in a proper sense but roughly parallel. The scheme we 
propose approximates the data by means of either an interpolation operator or a 
near-interpolation operator, both based on radial basis functions. These operators 
enjoy, in particular, two interesting properties: a subdivision technique and a 
recurrence relation. First, the recurrence relation is applied on each line or curve, 
so obtaining a set of approximated curves on the considered surface. This can be 
done simultaneously on all the lines or curves by means of parallel computation. 
Second, the obtained approximations of the surface curves are composed together 
by using the subdivision technique. The procedure gives, in general, satisfactory 
approximations to continuous surfaces, possibly with steep gradients. 


1. Introduction 


We consider the problem of approximating a continuous function f from
$\mathbb{R}^2$ to $\mathbb{R}$ known on a set of points $S_n$, which are
situated on a family of (straight) lines or curves on a domain
$D \subset \mathbb{R}^2$, convex and bounded. The most interesting case occurs
when the lines or curves are parallel. In general, we can admit that some
points (possibly all) are not collocated exactly on the lines or curves but
close to them, or that the lines or curves are not parallel in a proper sense
but roughly parallel. Clearly there is a structure to such data, but it is not
required that the data distribution on each line or curve shows a special
regularity, that is, the points can be irregularly spaced and in different
positions on each line or curve. A frequent feature
of this kind of data, often called track data, is that two points which are 
adjacent to each other along a given track are much closer together than 
points on different tracks. 

As a matter of fact, in several applied problems the function values are 
known along a number of parallel lines or curves in the plane, as in the 
case of ocean-depth measurements from a survey ship or meteorological 
measurements from an aircraft or an orbiting satellite. These data are 
affected by measurement errors and, generally, taken near to rather than 
precisely on straight or curved tracks, owing to the effects of disturbing 
agents (wind, waves, etc.). Moreover, similar geometric structures of data 
arise in many physical experiments which investigate the dependence of 
one variable on another for each of a number of discrete values of a third 
variable. 

Several methods (see, e.g., [4, 5, 8, 10, 11] and references therein) have 
been proposed to solve some restricted formulations of the considered prob- 
lem by using different approximation techniques (approximation or inter- 
polation, as opposed to approximation) and tools (tensor-product splines, 
least squares, radial basis functions, Chebyshev polynomials of the first or 
second kind, etc.). 

The present paper concerns mainly the approximation of a function 
whose coordinates are specified at points on or near families of parallel 
or roughly parallel lines or curves. A variety of boundaries to the data are 
admissible, so it is not required to transform boundaries to yield rectangular 
regions. Functions to be fitted are continuous, possibly with steep gradients. 
It is noteworthy that our approximation method does not require in itself 
any data structure, namely, it applies to scattered data. So, it can be 
applied to data on a family of irregularly distributed polygonal curves such 
as the bathymetry ship tracks in the Marianas trench considered in [5]. 

The approximation is obtained by means of a near-interpolation op- 
erator or an interpolation operator, both based on radial basis functions. 
Precisely, the near-interpolant is obtained by introducing a shape parameter 
in a cardinal radial basis interpolant, so that the resulting formula will not 
be interpolating in a strict sense, but tends to the interpolation formula as 
the parameter goes to zero [6, 7]. On the other hand, the shape parameter 
makes the formula more flexible and allows extension of the set of functions 
to be approximated. In Section 2 we define jointly the near-interpolation
operator and the interpolation operator, giving a constructive way to obtain
them. Moreover, we point out two basic properties of these operators, which
can be easily used to construct algorithms for iterative and parallel
computation. Section 3 is devoted to discussing practical aspects of the
application of the operators to approximating surface data on a general family
of lines or curves. Section 4 considers special devices to be adopted when the
lines or curves are parallel or roughly parallel. Numerical experiments are
carried out in Section 5 by considering classical test functions, defined on
two sets of track data (lines and exponential curves), and also on two other
sets, obtained by perturbing the previous ones. Moreover, a practical
application of the operators is shown, which concerns the reconstruction of a
surface of a stone ashlar in a Roman theatre. Finally, in Section 6 some
conclusions are drawn.


2. Interpolation and Near-Interpolation Operators 


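Let $S_n = \{x_i,\ i = 1, 2, \ldots, n\}$ be a set of distinct points (nodes)
in a domain $D \subset \mathbb{R}^2$ with associated real values
$\{f_i,\ i = 1, 2, \ldots, n\}$, and let $\alpha(x, y; r)$, with $x, y \in D$
and $r > 0$, be a continuous positive real function such that

\[
\lim_{r \to 0} \alpha(x, y; r) = \alpha(x, y),
\]

where $\alpha(x, y) > 0$ if $x \ne y$, $\alpha(x, y) = 0$ if $x = y$. Define
the basis functions

\[
g_j(x; r) = \frac{\prod_{k=1, k \ne j}^{n} \alpha(x, x_k; r)}
{\sum_{h=1}^{n} \prod_{k=1, k \ne h}^{n} \alpha(x, x_k; r)}
\]

and the operator

\[
F_{S_n}(x; r) = \sum_{j=1}^{n} f_j\, g_j(x; r)
= \sum_{j=1}^{n} f_j\,
\frac{\prod_{k=1, k \ne j}^{n} \alpha(x, x_k; r)}
{\sum_{h=1}^{n} \prod_{k=1, k \ne h}^{n} \alpha(x, x_k; r)}
\]

or equivalently

\[
F_{S_n}(x; r) = \sum_{j=1}^{n} f_j\,
\frac{1/\alpha(x, x_j; r)}{\sum_{h=1}^{n} 1/\alpha(x, x_h; r)} . \tag{1}
\]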

If r = 0, then $F_{S_n}(x) = F_{S_n}(x; 0)$ is an interpolation operator to
the underlying function f(x) at the nodes $x_i$, (i = 1, ..., n), and the
$g_j(x) = g_j(x; 0)$ are cardinal, that is, $g_j(x_i) = \delta_{ij}$, where
$\delta_{ij}$ is the Kronecker delta. On the other hand, if $r \ne 0$, the
operator $F_{S_n}(x; r)$ is no longer interpolating, but can be considered,
for small values of the parameter r, as a near-interpolation operator. Note
that

\[
F_{S_n}(x; r) = F_{S_n}(x) + \sum_{j=1}^{n} f_j\, [g_j(x; r) - g_j(x)] .
\tag{2}
\]

Many choices are possible for the function $\alpha(x, y; r)$. Nevertheless,
experience suggests identifying $\alpha$ with a radial basis function

\[
\alpha(x, y; r) = \phi(\|x - y\|^2 + r),
\]

where $\|\cdot\|$ is a convenient norm, generally the Euclidean norm
$\|\cdot\|_2$. Two of the standard functions commonly used in radial basis
function approximation are

\[
\phi_1 = \|x - y\|^2 + r, \qquad
\phi_2 = \exp[\beta \|x - y\|^2 + r], \quad \beta > 0. \tag{3}
\]
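
To make the definition concrete, the following Python sketch evaluates the
operator in the form (1) with $\alpha$ chosen as the first function in (3) and
the Euclidean norm in the plane. It is an illustrative sketch, not the
authors' code; the names `alpha` and `near_interpolant` are hypothetical, and
the case r = 0 at a node is handled explicitly by returning the nodal value,
which is the limit implied by the cardinality of the basis functions.

```python
def alpha(x, y, r):
    """First choice in (3): squared Euclidean distance plus shape parameter."""
    return (x[0] - y[0]) ** 2 + (x[1] - y[1]) ** 2 + r


def near_interpolant(x, nodes, values, r=0.0):
    """Evaluate the operator (1) at the point x.

    r = 0 gives the interpolation operator, r > 0 the near-interpolant."""
    numerator = denominator = 0.0
    for node, f in zip(nodes, values):
        a = alpha(x, node, r)
        if a == 0.0:
            # Only possible for r = 0 with x on a node: the operator
            # reproduces the nodal value there.
            return f
        numerator += f / a
        denominator += 1.0 / a
    return numerator / denominator
```

With r = 0 the nodal values are reproduced exactly; with a small positive r
the value at a node is only close to the datum, which is the
near-interpolation behaviour described above.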


The operator $F_{S_n}(x; r)$ just defined, including its special case
$F_{S_n}(x)$, enjoys many interesting properties [1]; we remark on two of
them. A subdivision technique can be applied, achieving noteworthy results,
very well suited for parallel computation [2]. Let us make a partition of the
set of nodes $S_n$ into q subsets $S_{n_j}$, so that the jth subset,
(j = 1, ..., q), consists of the nodes $x_{j1}, x_{j2}, \ldots, x_{jn_j}$,
with $n_1 + n_2 + \cdots + n_q = n$, and the values $f_{jk_j}$,
(j = 1, ..., q; $k_j$ = 1, ..., $n_j$), correspond to the nodes $x_{jk_j}$.
Then $F_{S_n}(x; r)$ can be rewritten in the form

\[
F_{S_n}(x; r) = \frac{\sum_{j=1}^{q} F_{S_{n_j}}(x; r)\, A_j}
{\sum_{j=1}^{q} A_j} , \tag{4}
\]

where

\[
A_j = \sum_{k_j=1}^{n_j} 1/\alpha(x, x_{jk_j}; r) .
\]
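
The subdivision property (4) can be checked numerically with a few lines of
Python. The sketch below is an illustration under stated assumptions: $\alpha$
is the first function in (3), r is positive so that no division by zero
occurs, and the helper names are hypothetical. Each subset returns its own
operator value together with its weight sum $A_j$, and the pairs are then
combined.

```python
def alpha(x, y, r):
    """Squared Euclidean distance plus shape parameter (assumes r > 0)."""
    return (x[0] - y[0]) ** 2 + (x[1] - y[1]) ** 2 + r


def operator_and_weight(x, nodes, values, r):
    """Return (F, A) for one subset: operator value and weight sum."""
    weights = [1.0 / alpha(x, node, r) for node in nodes]
    A = sum(weights)
    F = sum(f * w for f, w in zip(values, weights)) / A
    return F, A


def combine(parts):
    """Compose subset results (F_j, A_j) as in the subdivision formula (4)."""
    total_weight = sum(A for _, A in parts)
    return sum(F * A for F, A in parts) / total_weight
```

Because each pair (F_j, A_j) depends only on its own subset, the pairs can be
computed independently, which is what makes the formula convenient for
parallel evaluation.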


As a particular case, i.e., for q = 2, the following multistage procedure
works very well. In the first stage, a given set of nodes
$S_{n_1} = \{x_i,\ i = 1, \ldots, n_1\}$ is considered and the corresponding
operator $F_{S_{n_1}}(x; r)$ is evaluated. In the second stage, it is required
to enlarge the considered set $S_{n_1}$, taking the union of it and another
set of nodes $S_{n_2} = \{x_j,\ j = 1, \ldots, n_2\}$. Now the operator on the
union set $S_{n_1} \cup S_{n_2}$, with $S_{n_1} \cap S_{n_2} = \emptyset$, can
be obtained simply by evaluating $F_{S_{n_2}}(x; r)$, related to the added set
$S_{n_2}$, and using the relation

\[
F_{S_{n_1} \cup S_{n_2}}(x; r) =
\frac{F_{S_{n_1}}(x; r)\, A_1 + F_{S_{n_2}}(x; r)\, A_2}{A_1 + A_2} ,
\]

where

\[
A_1 = \sum_{i=1}^{n_1} 1/\alpha(x, x_i; r), \qquad
A_2 = \sum_{j=1}^{n_2} 1/\alpha(x, x_j; r),
\]

and $A_1$ is known.

Setting $S_{n_1} = S_n$ and $S_{n_2} = \{x_{n+1}\}$, we have that an
additional node $x_{n+1}$ can be added to the original set $S_n$ by simply
combining an extra term with the original formula, as expressed in the
recurrence relation

\[
F_{S_{n+1}}(x; r) =
\frac{F_{S_n}(x; r)\, A_n + f_{n+1}/\alpha(x, x_{n+1}; r)}
{A_n + 1/\alpha(x, x_{n+1}; r)} , \tag{5}
\]

where

\[
A_n = \sum_{k=1}^{n} 1/\alpha(x, x_k; r) .
\]


The same formula also provides the tool to eliminate points one by one from
the original set of data. It is sufficient to solve (5) with respect to
$F_{S_n}(x; r)$. The possibility of adding or eliminating points from the
original data set may be useful in some cases (optimal subset selection,
interactive graphics, etc.).
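
Adding and eliminating a node through the recurrence relation (5) can be
sketched in Python as follows. The sketch assumes $\alpha$ as the first
function in (3) with r > 0 and carries the pair (F, A) along; the function
names are hypothetical. Elimination is obtained by solving (5) for the
operator on the smaller set.

```python
def alpha(x, y, r):
    """Squared Euclidean distance plus shape parameter (assumes r > 0)."""
    return (x[0] - y[0]) ** 2 + (x[1] - y[1]) ** 2 + r


def add_node(F, A, x, node, f, r):
    """Recurrence (5): update (F, A) at the point x when a node is added."""
    w = 1.0 / alpha(x, node, r)
    return (F * A + f * w) / (A + w), A + w


def remove_node(F_new, A_new, x, node, f, r):
    """Relation (5) solved for the operator without the given node."""
    w = 1.0 / alpha(x, node, r)
    A = A_new - w
    return (F_new * A_new - f * w) / A, A


def evaluate_by_recurrence(x, nodes, values, r):
    """Build the operator value node by node, starting from the first node."""
    F, A = values[0], 1.0 / alpha(x, nodes[0], r)
    for node, f in zip(nodes[1:], values[1:]):
        F, A = add_node(F, A, x, node, f, r)
    return F, A
```

Removing the last remaining node is not meaningful, since the weight sum of
the empty set is zero.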


3. Basic Procedure 


To approximate surface data on a general family of lines or curves using the
operator $F_{S_n}(x; r)$ we can proceed in several ways, which depend mainly
on data structure, data errors, and expected results. Omitting details
pertinent to specific situations, the basic procedure could be sketched as
follows.
Step 1. The lines or curves are numbered from 1 to q.


Step 2. The set $S_n$ is partitioned into q subsets $S_{n_j}$,
(j = 1, 2, ..., q), so that the points of $S_{n_j}$ belong to the jth line or
curve, namely, are on or near the line or curve.

Step 3. All the points of $S_{n_j}$, i.e., $x_{j1}, x_{j2}, \ldots, x_{jn_j}$,
are ordered with respect to a direction on each line or curve.


Step 4. The recurrence relation (5) for $F_{S_{n_j}}(x; r)$ is applied on the
ordered set $S_{n_j}$ for any point $x \in D$.


Step 5. Approximations obtained for the jth line or curve, with
j = 1, 2, ..., q, are composed together by the subdivision technique (4), so
obtaining an approximation to the underlying surface at the point x.


We give some comments on the use of the procedure: 

(a) Step 4 can be carried out simultaneously on subsets of the family of
lines or curves by means of parallel computation. Suppose we consider
q = kp + r lines or curves, where p + 1 is the number of processors available
and 0 < r < k. Then the first k lines or curves are assigned to the first
processor, the second k lines or curves to the second processor, and so on up
to the last processor, which handles r lines or curves. More information on
both serial and parallel algorithms on different architectures can be found in
[2]. There it is shown in particular that, under the condition of a
well-balanced workload, the speed-up factor of the parallel computation is
approximately equal to the number of processors.


(b) If we let x vary on a suitable discrete set of points on the jth line or 
curve, (j = 1,2,...,q), we obtain an approximation to that curve on the 
surface whose projection is the jth line or curve. Obtaining good approx- 
imations to all curves on the surface corresponding to the plane lines or 
curves of the given family can be in itself of considerable interest in many 
applications, as an example to optimize information on topographical con- 
tour lines in the construction of a digital elevation model. 


(c) Step 5 must be completely executed when the operator is evaluated at the
points of a regular grid. Approximating the function values on a grid is
required to represent the surface on a computer display, but it can also be
useful in other cases. In fact, remapping the original data into a rectangular
mesh by means of $F_{S_n}(x; r)$ can represent the preliminary operation in
order to apply techniques based on tensor-product approximation (see, e.g.,
[11]).


Now we discuss briefly how to mitigate two crucial shortcomings in the
approximation by $F_{S_n}(x; r)$, namely, the occurrence of flat spots at the
nodes and the dependence of the operator on all the nodes. The aim can be
reached by using local approximations and mollifying functions (see, e.g.,
[1]).

To avoid the generally undesirable property that a flat spot occurs at each
data point, the use of information about derivatives, either given or
generated from data, is recommended and may result in substituting in (1) each
functional value $f_j$ with any approximation $L_j(x)$ to f(x) in $x_j$ such
that $L_j(x_j) = f_j$, (j = 1, 2, ..., n). In particular, $L_j(x)$ can be the
truncated Taylor expansion of f(x) about $x = x_j$, up to derivatives of a
certain order evaluated at the point $x_j$; however, the technique calls for
additional derivative values that are not normally available as data. A more
practical solution is to determine a local approximant $L_j(x)$ to f(x) at the
point $x_j$, obtained by means of a moving least squares method using weights
with reduced compact support.

Another computational problem is due to $F_{S_n}(x; r)$ being global, that is,
each interpolated value depends on all the data. When the number of data is
very large, the calculation of the operator becomes proportionately longer
and, eventually, the method will become inefficient or impractical. If the
weighting function means that only nearby data sites are significant in
computing any interpolated value, a considerable saving in computation could
be effected by eliminating calculations with distant data sites. Several
authors have suggested localizing schemes in such a way as to obtain a
weighting function which is zero outside some disk of suitable radius centered
at each node (see [14]). Hence, to localize $F_{S_n}(x; r)$ one can multiply
the functions $1/\alpha(x, x_j; r)$ in (1) by the so-called mollifying
functions $\tau_j(x, x_j)$, (j = 1, 2, ..., n), which are nonnegative, have
local supports in some appropriate sense and satisfy $\tau_j(x_j, x_j) = 1$,
as for example the cut functions

\[
\tau_j(x, x_j) = \left( 1 - \|x - x_j\|_2^2 / r_j^2 \right)_+ , \tag{6}
\]

where $r_j$ is the radius of the circle of support at the point $x_j$.
An interesting alternative to cut functions is offered by the function

\[
\tau(t) =
\begin{cases}
\ldots, & \text{if } 0 \le t \le 1/2^n, \\
0, & \text{if } t > 1/2^n,
\end{cases} \tag{7}
\]

where $t = \|x\|_2^2$. In fact, we have $\tau(0) = 1$ and $\tau(1/2^n) = 0$;
the function is convex and its tangent plane at $t = 1/2^n$ is horizontal; the
localizing effect increases with n. Localizing functions like $\tau(t)$,
possibly with different orders of continuity, may represent an alternative
choice to families of localizing functions based on cut functions (see, e.g.,
[17]).

Summing up, we are led to consider the following general form of (1)

\[
F_{S_n}(x; r) = \sum_{j=1}^{n} L_j(x)\,
\frac{\tau_j(x, x_j)/\alpha(x, x_j; r)}
{\sum_{h=1}^{n} \tau_h(x, x_h)/\alpha(x, x_h; r)} . \tag{8}
\]

4. Special Devices


The considered basic procedure, generally working with an operator in the
form (8), can be improved in order to handle particular data structures as
well as possible, mainly with regard to using mollifying functions and local
approximations. A number of cases deserve to be discussed separately.

Parallel lines. It is not restrictive to suppose that the lines are parallel
to the abscissa axis, so that the distance between any pair of lines is im- 
mediately determined. Then, evaluating the operator at a point x, only 
the nodes belonging to a convenient disc centered at x are considered, so 
restricting the use of the mollifying function with a remarkable reduction of 
computational effort. This is easily done by calculating directly a relatively 
small number of distances, in order to find the lines nearest to z and the 
points on these lines nearest to the projections of x on the lines. 

To construct a local approximant in each node of the disc, we distinguish
between two different situations. If the nodes lie on the lines and are
much closer together than the lines themselves, then a univariate discretized
Taylor formula of order two can be used. Precisely, the first and second
derivatives at a node are estimated by means of numerical differentiation
formulas involving three consecutive nodes. Otherwise, if the nodes do
not lie exactly on the lines, one can consider their projections on the
lines themselves, assuming that the nodes would be on the lines in the
absence of noise. It is more cautious, however, to find a second-degree
polynomial which represents the least squares approximation at each considered
node, obtained by means of a number of closest nodes.
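
The two local constructions just described can be sketched as follows. This is only a minimal illustration, assuming NumPy; the function names `taylor2_from_three_nodes` and `local_quadratic_lsq`, the choice of unequally spaced three-point formulas, and the default of nine closest nodes are assumptions made here, not details given in the paper.

```python
import numpy as np

def taylor2_from_three_nodes(t, f):
    """Second-order discretized Taylor approximant centred at the middle of
    three consecutive nodes t[0] < t[1] < t[2] lying on the same line."""
    h0, h1 = t[1] - t[0], t[2] - t[1]
    # Three-point differentiation formulas for unequal spacing
    # (both are exact when f is a quadratic).
    d1 = (-h1 / (h0 * (h0 + h1)) * f[0]
          + (h1 - h0) / (h0 * h1) * f[1]
          + h0 / (h1 * (h0 + h1)) * f[2])
    d2 = 2.0 * (f[0] / (h0 * (h0 + h1))
                - f[1] / (h0 * h1)
                + f[2] / (h1 * (h0 + h1)))
    return lambda s: f[1] + d1 * (s - t[1]) + 0.5 * d2 * (s - t[1]) ** 2

def local_quadratic_lsq(nodes, values, center, k=9):
    """Least squares second-degree polynomial around `center`, fitted to the
    k nodes closest to it. Returns a callable p(x, y)."""
    nodes = np.asarray(nodes, float)
    values = np.asarray(values, float)
    center = np.asarray(center, float)
    idx = np.argsort(np.linalg.norm(nodes - center, axis=1))[:k]
    dx, dy = (nodes[idx] - center).T
    A = np.column_stack([np.ones_like(dx), dx, dy, dx**2, dx * dy, dy**2])
    coef, *_ = np.linalg.lstsq(A, values[idx], rcond=None)

    def p(x, y):
        u, v = x - center[0], y - center[1]
        return float(coef @ np.array([1.0, u, v, u * u, u * v, v * v]))
    return p
```

The least squares variant needs at least six nodes that do not all lie on a single conic; with noisy nodes near the lines, a larger `k` trades locality for stability.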


Parallel curves. As in the previous case, the distance between any pair 
of curves can be easily determined and only the nodes belonging to a con- 
venient disc centered at x, the point of interpolation or near-interpolation, 
can be considered, so that also now the use of the mollifying function can 
be restricted. 

To construct a local approximant in each node of the disc, a univariate
discretized Taylor formula of order two can still be used, provided that the
curves are sufficiently well approximated by a polygonal curve, the nodes
lie on the curves, and the curves are not too close together. However, this
requires particular care near the vertices of the polygonal curve. A more
generally applicable alternative consists in finding a least squares
polynomial as described above, and this also works well when the nodes do
not lie exactly on the curves.


Roughly parallel lines or curves. Once again, a convenient disc cen- 
tered at x can be determined with relative ease and the use of the mollifying 
function can be restricted. 

To construct a local approximant in each node of the disc, approximation
by a least squares polynomial is recommended for nodes both on and
near the lines or curves.

Interpolation versus near-interpolation. In many applications, function
values are subject to errors; hence it is not appropriate to interpolate
the function at the data in the sense of exact matching, but it seems more
appropriate to approximate the function or, more precisely, to consider
near-interpolation. Data requiring near-interpolation, ranging from points
on a rectangular mesh to arbitrarily scattered points, occur in virtually every
field of science and engineering. Sources include both experimental results
(experiments in chemistry, physics, engineering) and measured values of
physical quantities (meteorology, oceanography, optics, geodetics, mining,
geology, geography, cartography), as well as computational values (e.g.,
output from finite element solutions of partial differential equations). In
particular, since the problem of constructing a smooth surface from a given
set of lines or curves appears in many instances in geophysics and geology,
the near-interpolation operator can meet this requirement.

The parameter r in the near-interpolation operator $F(x; r)$ in (8) has
the effect that, in general, the gradient of the rendered surface is not zero
at the nodes. As a consequence, the surface is considerably smoother than
for r = 0. However, if r is too small, the first derivatives of $F(x; r)$ are
highly oscillating and their values are nearly zero. Clearly, the goal is to
choose an “optimal” value of r, such that $F(x; r)$ does not present flat spots
but, at the same time, maintains sufficient computational accuracy, in
particular at the nodes (see [3] for a discussion on the choice of parameter
values).
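
The role of r can be illustrated with a small sketch. It assumes, purely for illustration, the Shepard-type basis function $\alpha(x, x_j; r) = \|x - x_j\|^2 + r$, no localizing factor, and $M_j(x) = f_j$; the function name `near_interpolant` is hypothetical, and the basis functions actually used in the paper are those of its Eq. (3), which may differ.

```python
import numpy as np

def near_interpolant(nodes, f, r=0.0):
    """Return F(x; r) = sum_j f_j w_j(x) / sum_j w_j(x), with the assumed
    weights w_j(x) = 1 / (|x - x_j|^2 + r)."""
    nodes = np.asarray(nodes, float)
    f = np.asarray(f, float)

    def F(x):
        d2 = np.sum((nodes - np.asarray(x, float)) ** 2, axis=1)
        if r == 0.0:
            hit = np.flatnonzero(d2 == 0.0)
            if hit.size:                 # r = 0: exact matching at a node
                return float(f[hit[0]])
        w = 1.0 / (d2 + r)
        return float(w @ f / w.sum())
    return F

# Four illustrative nodes and function values (not data from the paper).
nodes = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])
f = np.array([0.0, 1.0, 3.0, 2.0])
F0 = near_interpolant(nodes, f, r=0.0)    # interpolation
Fr = near_interpolant(nodes, f, r=0.05)   # near-interpolation
node_error = Fr(nodes[1]) - f[1]          # small but nonzero for r > 0
```

With r = 0 the operator reproduces the data exactly at the nodes; with r > 0 the value at a node is a weighted average dominated by that node, so the deviation there is small and shrinks as r decreases.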

It should be noted that the near-interpolation errors $F_{S_n}(x_i; r) - F_{S_n}(x_i)$ at
the nodes in (2) must not be confused with the unknown errors which affect
the corresponding function values $f_i$. However, it is reasonable to arrange
matters so that the near-interpolation errors and the errors on $f_i$ are quantities of
the same order.

On the other hand, in some cases, it may be interesting to consider inter- 
polation as opposed to approximation. As an example, in [4] the construc- 
tion of tensor-product spline interpolants to various types of data structure 
is discussed. Such interpolants can be constructed efficiently and directly 
for data on a rectangular mesh, but also for data on a family of parallel 
lines and data lying close to a family of parallel lines. 


5. Some Tests and a Practical Application 


Carrying out numerical experiments on a number of test functions, we
considered two sets of data, actually given in different domains but mapped
for uniformity into the square [0,1] x [0,1]. The former set arises from
measurements by photogrammetric survey on a deteriorated face of an
ancient ashlar (see below). The latter is obtained by considering the family
of curves [16]

$$y_p(x) = 0.5(p+1)\exp[0.1(x+1)] + 0.05(x+1)\exp[0.5(p+1) - 1] + 0.5(p-1),$$

where $p = -1 + kh$ $(k = 0, \ldots, m_2 - 1)$ and $h = 2/(m_2 - 1)$. On each
curve $y_p$ a set of equispaced abscissas $x_i^{(p)}$ $(i = 1, \ldots, m_1^{(p)})$ is considered.
The values of $m_1^{(p)}$ and $m_2$ are chosen suitably. Moreover, from the sets
just defined, we obtained two other sets by slightly perturbing both the
coordinates of the data points and the distances between pairs of contiguous
lines or curves.

Then, we generated the function values at the points of the considered
sets for each of Franke’s six test functions [12, 13], using 703 data points for
the photogrammetric lines and 635 for the exponential curves in the case
of the original data sets, and almost the same numbers in the case of the
perturbed data sets. The basic procedure and the special devices, described
in Sections 3 and 4 respectively, have been applied to reconstructing the
test surfaces by the operator $F_{S_n}(x; r)$ in (8), considering all the possible
options, namely, $r \neq 0$ vs. $r = 0$, $\phi_1$ vs. $\phi_2$ in (3), and $\tau_j(x, x_j)$ vs. $\tau(t)$
in (6) and (7) respectively. In every case the accuracy achieved agrees
with that described in the literature or expected by
analogy, but the procedure allows an impressive improvement in flexibility
and computer time.

The near-interpolation operator can be considered as a tool to reconstruct
a lateral surface of a stone ashlar in the Roman theatre in Aosta
(Aosta Valley, Italy). Over the centuries, the ashlar has been seriously
damaged by atmospheric agents, suffering a loss of material so that the
considered surface is actually very rough. The surface reconstruction
represents a step in implementing the software for a dynamical three-dimensional
representation of parts of the ancient building in order to plan restoration
strategies. The problem has been considered in detail in [9] by using a
cross-sectional technique and nonuniform rational B-splines (see also [15]
and [3], where a different technique is employed).

On a rectangular region of a lateral face of the ashlar, the coordinates
(x, y, z) of each mesh point of a grid have been measured by photogrammetric
survey, where (x, y) gives the position of the point in the original plane
of the face, and z is a measure of the variation with respect to this plane,
that is, z represents a measure of damage. The dimensions in millimetres of
the region are 60 < x < 960 and 81.6 < y < 531.6, whereas -160 < z < 0.
The points lie on 19 lines and each line contains 37 points.





Figure 1. Near-interpolation operator reconstruction of the ashlar surface. 


From the data many surface reconstructions have been obtained by
$F_{S_n}(x; r)$, considering the several options illustrated above. In short,
it proves convenient to use the near-interpolation operator, to set simply
$M_j(x) = f_j$, because the surface is not smooth, and to adopt the tentative
parameter values suggested by experience on test functions. In practice,
the surface in Figure 1 is a convenient representation of the actual surface
on the ashlar.


6. Conclusion 


Given a family of lines or curves on a plane domain, possibly parallel or 
roughly parallel, we consider the problem of approximating a continuous 
real function, known on a set of points situated on or near the lines or curves. 
The procedure we propose approximates the data by means of either an in- 
terpolation operator or a near-interpolation operator, both based on radial 
basis functions. Since the operators enjoy some noteworthy properties, the 
procedure succeeds in reconstructing surfaces with a satisfactory approx- 
imation, but especially shows a considerable flexibility and an impressive 
reduction in computer time, thus offering a very convenient instrument of
application.

Coordinate measuring machines are now widely used to qualify industrial pieces.
Nevertheless, current CMM software is usually restricted to the determination of mean
values. This is the case both for the characterization of individual surfaces and for the
determination of geometrical errors. However, in accordance with quality standards, the
uncertainty of each measurement should also be defined. At the last CIRP seminar, a new
nonlinear least squares method was proposed for that purpose, to define the error
bars of the parameters estimated for each measured surface. These values are deduced
from the gap between the measured coordinates and the associated optimized surface.
The present work extends this to the propagation of such uncertainties to the
determination of ISO 1101 tolerances (dimensions and geometrical errors). To illustrate
this approach, a specification was inspected on a real industrial piece, with respect to
the ISO 1101 standard. For this industrial application, different measurement procedures
were proposed and carried out. The uncertainties of the estimated geometrical errors were
then evaluated, showing the influence of the experimental method on the reliability of
the measurement. This example thus demonstrates the need to optimize the inspection
process. To conclude, different aspects of the inspection are
discussed to improve the verification of ISO 1101 specifications.


1. Introduction 


Different research programs were started during a recent CIRP seminar [1]
concerning the control of uncertainties in the whole industrialization process of
a mechanical unit. A generalized principle of uncertainty was thus presented
[2]. Within this topic, the uncertainty of measurement is also considered. The
concept of uncertainty is well known in metrology. It is represented by a
statistical parameter associated with the result of a given measurement and is
related to the fact that each measurement is altered by random errors, which
have to be quantified and reported in all the results or operations which derive
from the measurement process.

Generally, only the uncertainties of the parameters that characterize each 
elementary surface are estimated. A lot of research work has already been done 
on this topic. Among these studies, some relate to the uncertainties induced by 
the measuring instrument. Others deal with the data-treatment procedures 
performed after the measurement process. 

In the same context, our paper shows some results obtained through a
new data-processing method. A vectorial approach to surfaces was used for that
purpose. Given the significant increase in the performance of
computers, no linearization of the equations was employed, contrary to
most current algorithms. The results are thus expressed in a global 3D
coordinate system, which remains the same for all the analyzed surfaces. The
interest of such an approach also lies in the fact that it allows an automatic
calculation of uncertainties. These uncertainties are simply derived from the set
of acquired coordinates, which is considered as a statistical sampling of the
analyzed surface. The covariance matrix of the parameters that characterize
each feature is also estimated, which makes it possible to define a confidence
interval that bounds the real surface.

Although various methods have been proposed to define uncertainties on an elementary
surface, the error bars on the evaluation of geometrical deviations are seldom
indicated. In accordance with the directives of ISO TC 213, a new method
of propagation of uncertainties is therefore proposed. The interpretation of
functional specifications, expressed through the ISO 1101 standard, requires
calculating distances between different geometrical elements:


• Datum surfaces and datum systems (ISO 5459), which are
mathematical surfaces described by vectorial geometry: point, line
or plane.
• Specified surfaces, which are generally known only through a set of
digitized points.


The calculation of the uncertainties of a given specification thus amounts to the
evaluation of the error bars of these calculated distances. This operation consists
in propagating the covariance matrix of the parameters that characterize each
measured elementary surface down to each derived distance [3]. Such
propagation is a mechanism that requires data transfer and thus depends on a
great number of parameters. The aim of our paper is therefore to show the influence
of the experimental methods on the final result. Different measurement
procedures will therefore be applied to the same workpiece. For each
experimental configuration, the resulting uncertainties will be derived directly
from the set of acquired points.


2. Experimental study 


The aim of this section is to define the uncertainties of the geometric deviations
evaluated to check ISO 1101 type specifications. The tested piece is the part
described in Figure 1. The part was digitized by our industrial partner according
to its usual experimental procedures. The files of the acquired coordinates were then
post-processed through our algorithms to define the error bars of the
measurements.



















Figure 1. Definition of the studied part and location of the surfaces. 


On the tested piece, the coaxiality between two cylinders had to be controlled
(Figure 2).





Figure 2. Definition of the specification to be controlled. 
For this type of specification, two types of construction can be considered:


• The first solution consists in constructing the line defined by the
centers of the two datum circles and then calculating the distances
between the centers of the circles that define the specified surface and this
constructed line. The distances obtained in this manner then have to be compared
to half of the tolerance imposed by the designer.


• The second solution consists in concatenating the two sets of points
defining datum circles 1 and 2, computing the cylinder that fits the
resulting data and finally calculating the distance between the axis of this
cylinder and the centres of the circles which define the axis of the specified
cylinder.


Figure 3 summarizes both constructions.


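[Figure 3 panels. First procedure: line datum through the centres of Datum Circle 1 and Datum Circle 2, with the distances from Circle i and Circle j to this line. Second procedure: datum cylinder, with the distances from the circle centres to its axis.]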

Figure 3. Checking process 


These two procedures have been implemented. The results derived from the two
constructions are reported in Table 1. Figures 4 and 5 show screen captures
of the software developed to perform these operations.


Table 1. Results obtained by the two types of constructions. 


Procedure | Mean distance d_i | Standard deviation
First     | 0.0135            | 8.1E-03
Second    | 0.0135            | 4.9E-03


The standard deviations are deduced from the uncertainties of the elementary 
surfaces. These uncertainties are derived directly from the set of acquired points 
and are then propagated to the calculated distances. The calculated distances are 
finally to be compared to half the tolerance imposed by the designer. 















It has to be noticed that:


• The mean values of the distances calculated by the two methods are
identical.

• The uncertainty of the distance derived from the first method is almost
two times larger than the value obtained through the second procedure.

This demonstrates that the choice of procedure has a quantitative influence on
the quality of the result.


[Screen capture, Procedure 1: result tree of the software. Specification 3 and Specification 2 each list three distances (Distances 1 to 3 and Distances 4 to 6; reference: line; toleranced: point). Computed distances: Distance 1, Moy 0.013557879, Sigma 4.88996558325422E…, Elt1: line (cylindre), Elt2: point (circle); Distance 4, Moy 1.35578790985176E-02, Sigma 8.05109109618828E-…, Elt1: line (cylindre), Elt2: point (circle).]

Figure 4. Chart of the results (first procedure) 


In this case, indeed, the controlled part satisfies the tolerance zone of the
specification whatever the selected method. On the other hand, with regard to
the quality of the results, the uncertainties are almost two times greater with the first
construction than with the second one, although the acquired coordinates are
exactly the same for both procedures.

Figure 5. Chart of the results (second procedure)


3. Statistical Confidence Boundary (SCB) 


Knowledge of the covariance matrix of the parameters that characterize
each feature makes it possible to define a confidence interval that bounds the real
surface. Such an interval, called the Statistical Confidence Boundary (SCB), may be
represented by a three-dimensional plot which gives a better understanding of
the concept of uncertainty.


3.1. SCB definition 


The SCB defines the position and thickness of the interval that delimits the zone
in which the real surface is included [4] [5]. To illustrate this topic, the confidence
interval of a plane is presented as an example. The calculation of this SCB uses
the following procedure: a set of points belonging to the ideal plane fitted to the
acquired coordinates is considered first. These points are built without any
uncertainty. The mean distance between each point and the plane is zero by
construction, but its variance defines the variability of the feature adjusted to the
experimental data. It thus allows calculating, for a fixed confidence level, the
statistical interval that bounds the real surface. This interval defines the SCB [6]
[7].

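It has to be pointed out that the uncertainty of each calculated distance, and
therefore the corresponding thickness of the SCB, are functions of the spacing
between the center of the plane, where the covariance matrix is known, and the
point where the estimation is computed.

According to the Guide to the Expression of Uncertainty in Measurement, the following
type of propagation law has been used to calculate the variance of each
distance:


$$u_c^2(y) = \sum_{i=1}^{n} \sum_{j=1}^{n} \frac{\partial f}{\partial a_i} \frac{\partial f}{\partial a_j}\, u(a_i, a_j) = \sum_{i=1}^{n} \left( \frac{\partial f}{\partial a_i} \right)^2 u^2(a_i) + 2 \sum_{i=1}^{n-1} \sum_{j=i+1}^{n} \frac{\partial f}{\partial a_i} \frac{\partial f}{\partial a_j}\, u(a_i, a_j) \qquad (1)$$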


Since the parameters $a_i$ estimated for the feature fitted to the experimental data
can be represented by random vectors, the previous expression can be rewritten
as follows:


Q,(y) = J QA) J; (2) 


Where: 2A) represents the covariance matrix (n x m) of the entries A; and J, is 
the Jacobian matrix defined there by: 





Vie{l...n}Vpe {ln} (3) 
Va? ER 
p22 OS DD OD. o (4) 
Ga; ôa, ôa, ôa; ôa}? ôa?’ Ba?’ Ba}”””” Ba” 


For a plane normal to direction (Z), and a point, without uncertainty, belonging
to this plane, the equation of a quadric is found. As the coordinates of the centre
point of the plane and the components of its normal vector are not correlated,
this equation is:

$$\mathrm{Var}(d) = X_i^2\, \mathrm{Var}(n_x) + Y_i^2\, \mathrm{Var}(n_y) + K\, \mathrm{Var}(O_z) \qquad (5)$$

where $(X_i, Y_i, Z_i)$ are the coordinates of the point $M_i$ of the plane, $(O_x, O_y, O_z)$ are
the coordinates of the center point of the plane, $(n_x, n_y, n_z)$ are the cosines of
its normal vector, and K is a constant.

It corresponds to the equation of an elliptic paraboloid. From this expression,
the bounding surfaces of the SCB can be calculated for a fixed confidence ratio
k. Figure 7 shows such an SCB obtained for a confidence ratio of k = 2. The
digitized points and the centre of the analyzed surface have been plotted too.
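
As an illustration of Eq. (5), the sketch below evaluates the half-thickness of the SCB of a plane on a small grid. It assumes NumPy, coordinates measured from the centre point of the plane, and K = 1; the function name `plane_scb_halfwidth` and all numerical variances are hypothetical values chosen for the example, not results from the paper.

```python
import numpy as np

def plane_scb_halfwidth(X, Y, var_nx, var_ny, var_oz, K=1.0, k=2.0):
    """Half-thickness k*sqrt(Var(d)) of the SCB of a plane normal to Z, with
    Var(d) = X^2 Var(n_x) + Y^2 Var(n_y) + K Var(O_z) as in Eq. (5).
    X and Y are measured from the centre point of the plane."""
    X = np.asarray(X, float)
    Y = np.asarray(Y, float)
    var_d = X**2 * var_nx + Y**2 * var_ny + K * var_oz
    return k * np.sqrt(var_d)

# Hypothetical variances (illustrative only).
xs = np.linspace(-50.0, 50.0, 5)
X, Y = np.meshgrid(xs, xs)
half = plane_scb_halfwidth(X, Y, var_nx=1e-9, var_ny=4e-9, var_oz=1e-6)
upper, lower = +half, -half   # bounding surfaces relative to the fitted plane
```

The half-thickness is smallest at the centre point, where only the position variance contributes, and grows away from it because the variances of the normal vector components are multiplied by the squared lever arms.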
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Figure 7: Plane SCB 


The SCB of other features like points, lines, etc. can be determined in the same
way. In the case of a point, it corresponds to an ellipsoid. The next part of the
paper shows that the behaviour of the results can be explained by using the SCB
concept.


3.2. Explanation of the results using the SCB concept 


The deviations between the two inspection procedures considered in Figure 3 can
now be explained using the SCB concept. The intention of the operator is to
control the distance between the center point ($M_i$) of each circle acquired on
the specified surface and the axis of the datum (O, V). The result of the best fit
of the acquired surfaces is represented in Figure 8. On the left part of the figure,
the SCBs of the datum circles and of one section of the specified surface are
ellipsoids. On the right part, since the datum is described by a cylinder, the
SCB of this surface corresponds to a complex quadric.


[Figure 8 labels: SCB Circle i; SCB Datum Cylinder; SCB Datum Circle 1.]


Figure 8: Surfaces SCB 


Each distance to control is characterized by its mean value $d_i$ and its uncertainty
$u_c$, defined by Eq. (6) and Eq. (7):

$$d_i = \left| \overrightarrow{OM_i} \wedge \vec{V} \right| \qquad (6)$$

$$u_c(d_i) = \sqrt{u^2(M_i) + u^2(\mathrm{datum})} \qquad (7)$$

where $d_i$ is the mean value of the calculated distance, $u_c(d_i)$ is the composed
uncertainty of the specification, $u(M_i)$ is the propagated uncertainty of point $M_i$, and
$u(\mathrm{datum})$ is the propagated uncertainty of the datum.
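
Equations (6) and (7) translate directly into a few lines of code. The sketch below assumes NumPy and normalizes the direction vector V, since Eq. (6) gives a distance only for a unit vector; the function names and the numerical values in the usage lines are hypothetical.

```python
import numpy as np

def distance_to_axis(O, V, M):
    """Eq. (6): d = |OM ^ V| for a datum axis through O with direction V."""
    V = np.asarray(V, float)
    V = V / np.linalg.norm(V)   # Eq. (6) presumes a unit direction vector
    OM = np.asarray(M, float) - np.asarray(O, float)
    return float(np.linalg.norm(np.cross(OM, V)))

def composed_uncertainty(u_point, u_datum):
    """Eq. (7): root sum of squares of the two propagated uncertainties."""
    return float(np.hypot(u_point, u_datum))

# Illustrative values only.
d = distance_to_axis(O=[0, 0, 0], V=[0, 0, 2], M=[3, 4, 10])   # 5.0
u_c = composed_uncertainty(3e-3, 4e-3)                          # 5e-3
```

Because the two contributions add in quadrature, a construction that enlarges the datum uncertainty increases the composed uncertainty even when the uncertainty of the point is unchanged.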


Figure 9 shows how the composed uncertainty can be deduced directly from
the SCBs defined on the datum and on each circle acquired on the specified
surface. In the first case the datum is obtained by creating the line defined by
the centers of the two datum circles. The SCB of the resulting item is still a
complex quadric. In the second case the datum is defined directly by the
cylinder fitted to the acquired points. It is thus clearly demonstrated that the
cross section of the datum SCB obtained by the first procedure is larger than that
of the second one. As a consequence, the composed uncertainty deduced from the first
construction will be much greater than the one calculated through the second
procedure.






[Figure 9 labels: SCB datum circle 2; SCB datum cylinder.]


Figure 9: Determination of the composed uncertainty. 


4. Conclusion 


In this paper, quantitative results highlighted that the procedures selected by
CMM operators during the design of the measurement process have a direct
consequence on the acceptance results for the tolerances. The only
information usually used to make a decision is a mean value. It never permits
selecting, among different equivalent procedures, the one that gives the most
reliable result. For that reason, a generic method has been developed to
calculate the uncertainties of each measured specification directly from the set
of acquired coordinates. This additional information allows quantifying the risk
of acceptance or rejection of controlled parts. In case of indecision, the operator
may then reduce the measurement uncertainties either by selecting constructions
which lower the propagation of uncertainties or quite simply by increasing
the number of points of the digitized surfaces.

The in-use uncertainty of an instrument is investigated according to the possible
measurement procedures, namely: as a comparator or as a standard. These two
alternatives are embedded in a unique model and the uncertainty components due
to instrument calibration and noise are discussed. Special attention is given to the
case of calibration by fitting with a straight line.


1. Introduction 


It is a common opinion that using an instrument as a standard or as a
comparator are two fully different experimental techniques for determining the
value of a given quantity X. With the first technique, widely adopted in
lower-accuracy applications, one uses the calibration curve of the
instrument to directly obtain the measurand estimate $\hat{X}$ from a reading Y. The
second technique, preferred in highest-accuracy applications, implies
comparing the values $\hat{X}_1$ and $\hat{X}_{ref}$ obtained, through the calibration curve,
from two readings $Y_1$ and $Y_{ref}$. These correspond to the measurand X and
a reference standard, having a known value $X_{ref}$ as close as possible to the
measurand value, being alternately connected to the instrument. The
measurand value is thus estimated by suitably combining the known value
of the reference standard with the difference (or ratio) of the estimates
given by the comparator. For example,


$$\hat{X}_{comp} = X_{ref} + \hat{X}_1 - \hat{X}_{ref}, \qquad (1)$$

where the subscript “comp” stands for “comparison”.
When using the instrument as a standard, it is good practice
to take into account the possible lack of stability of the calibration curve
by means of a “zero” reading, that is, a reading of the instrument response
$Y_0$, typically different from zero, when a measurand of zero value is fed to
its input. To this reading there corresponds an estimate $\hat{X}_0$, which is subtracted
from $\hat{X}$ to form the estimate $\hat{X}_d$, where the subscript “d” stands for
“direct”. Therefore, in this procedure too the measurand estimate involves a
difference, that is,


$$\hat{X}_d = \hat{X} - \hat{X}_0, \qquad (2)$$


so that the two seemingly different methods can be modelled in similar
ways.

The uncertainty provided by method (2) is typically larger than (or at
best comparable with) that obtained with method (1). This means
that the difference term $\hat{X}_1 - \hat{X}_{ref}$ in Eq. (1) has a smaller uncertainty
than the corresponding term $\hat{X} - \hat{X}_0$ of Eq. (2). In both cases, the
estimate uncertainty can be imagined as the combination of two components,
one accounting for repeatability and the second for traceability to SI units,
through the reference standard with model (1) and through instrument
calibration with model (2). Indeed, as is well known, in the former difference
the dominant uncertainty contribution is random noise of the comparator
(and possibly its finite resolution [1, 2]), whereas in the latter both
contributions are embedded. Yet, the only distinction between the two difference
terms in Eqs. (1) and (2) is quantitative. This suggests that the traceability
contribution to the difference uncertainty is somehow proportional to
the difference value. We have proved this for a linear calibration curve.


2. The calibration of a measuring instrument 


The mathematics underlying the calibration of an instrument is well
established. For a complete overview, see, e.g., [3]. Usually, an instrument is
calibrated by providing to its input m reference values, or stimuli, $X_i$ of a
given quantity X and by observing the corresponding m responses $Y_i$. The
points thus obtained are fitted by a function


$$Y = g(\alpha_0, \alpha_1, \ldots, \alpha_{n-1}, X), \qquad (3)$$


the parameters $\alpha_h$ of which are to be determined. This estimation problem
concerns a vector measurand, that is, the parameter vector $\alpha$. For example,
in the common case of a polynomial function


$$Y = \sum_{h=0}^{n-1} \alpha_h X^h \quad (n < m), \qquad (4)$$

one can write


$$\begin{aligned}
Y_1 &= \alpha_0 X_1^0 + \alpha_1 X_1^1 + \ldots + \alpha_{n-1} X_1^{n-1} + \varepsilon_1 \\
Y_2 &= \alpha_0 X_2^0 + \alpha_1 X_2^1 + \ldots + \alpha_{n-1} X_2^{n-1} + \varepsilon_2 \\
&\;\;\vdots \\
Y_m &= \alpha_0 X_m^0 + \alpha_1 X_m^1 + \ldots + \alpha_{n-1} X_m^{n-1} + \varepsilon_m
\end{aligned} \qquad (5)$$

or, in matrix terms,

$$Y = X\alpha + \varepsilon, \qquad (6)$$


by defining $Y^T_{1 \times m} = [Y_1\ Y_2\ \cdots\ Y_m]$, $\alpha^T_{1 \times n} = [\alpha_0\ \alpha_1\ \cdots\ \alpha_{n-1}]$, and
$X_{m \times n} = [X^0\ X^1\ \cdots\ X^{n-1}]$. The error vector $\varepsilon$ is such that

$$E(\varepsilon) = 0 \quad \text{and} \quad E(\varepsilon \varepsilon^T) = V(Y) = \sigma^2 I, \qquad (7)$$


where V(Y) is the covariance matrix of Y and I is the identity matrix.
That means that the responses $Y_i$ are independent and have the same
variance $\sigma^2$.


By further assuming that the X vector has negligible uncertainty
compared to that of Y, the estimated coefficient vector $\hat{\alpha}$ is given by a
standard least squares adjustment


$$\hat{\alpha} = [X^T X]^{-1} X^T Y. \qquad (8)$$


The covariance matrix $V(\hat{\alpha})$ is obtained from

$$V(\hat{\alpha}) = J_Y V(Y) J_Y^T, \qquad (9)$$

where the Jacobian $J_Y$ is a rectangular n x m matrix whose generic element
is $J_{Y,ij} = \partial \hat{\alpha}_i / \partial Y_j$ (see, e.g., [4]).


Application of Eq. (9) to Eq. (8) yields


$$V(\hat{\alpha}) = u^2(Y_i)\, [X^T X]^{-1}. \qquad (10)$$



The analytical form of the estimate $\hat{\alpha}$ and of its covariance $V(\hat{\alpha})$
depends upon the degree of the polynomial. In the linear case, in which
$\alpha^T = [\alpha_0\ \alpha_1]$ and

$$X = \begin{bmatrix} 1 & X_1 \\ \vdots & \vdots \\ 1 & X_m \end{bmatrix},$$

the well-known solution is [5]

$$\hat{\alpha}_0 = \bar{Y} - \hat{\alpha}_1 \bar{X} \qquad (12)$$

and

$$\hat{\alpha}_1 = \frac{\sum (X_i - \bar{X})(Y_i - \bar{Y})}{\sum (X_i - \bar{X})^2}, \qquad (13)$$

where $\bar{X} = (\sum X_i)/m$ and $\bar{Y} = (\sum Y_i)/m$. The estimate covariance is

$$V(\hat{\alpha}) = \frac{u^2(Y_i)}{m \sum (X_i - \bar{X})^2} \begin{bmatrix} \sum X_i^2 & -\sum X_i \\ -\sum X_i & m \end{bmatrix}. \qquad (14)$$

Correlation between slope and intercept can be avoided by a suitable
change of variables [1]. In general, this is not possible for higher-degree
fittings.

The analytical form of the parameter vector estimate $\hat{\alpha}$ depends on the
particular fitting function $Y = g(\alpha, X)$ adopted. This might as well be
nonlinear in the parameters, which would imply a nonlinear adjustment.

In several cases the uncertainty of the X vector is not negligible. The
analysis is then more complicated [6, 7], but in any case this further
uncertainty contribution is embedded into the uncertainty matrix $V(\hat{\alpha})$.
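
The least squares adjustment of Eq. (8) and the covariance of Eq. (10) can be sketched as follows. The code assumes NumPy, equal and uncorrelated uncertainties u(Y) on the responses, and negligible uncertainty on the stimuli; the function name `fit_polynomial` and the calibration data are hypothetical.

```python
import numpy as np

def fit_polynomial(X, Y, n, u_Y):
    """Return (alpha_hat, V_alpha) for Y = sum_h alpha_h X^h, h = 0..n-1,
    following Eqs. (8) and (10). u_Y is the common standard uncertainty
    of the responses Y_i."""
    # m x n design matrix with columns X^0, X^1, ..., X^(n-1)
    A = np.vander(np.asarray(X, float), N=n, increasing=True)
    AtA_inv = np.linalg.inv(A.T @ A)
    alpha_hat = AtA_inv @ A.T @ np.asarray(Y, float)
    return alpha_hat, u_Y**2 * AtA_inv

# Illustrative straight-line calibration (invented data).
X = np.array([0.0, 1.0, 2.0, 3.0, 4.0])
Y = np.array([0.1, 2.0, 4.1, 5.9, 8.2])
alpha_hat, V_alpha = fit_polynomial(X, Y, n=2, u_Y=0.1)
```

For n = 2 the result coincides with the closed-form straight-line solution given above; the off-diagonal element of `V_alpha` is the covariance between intercept and slope. For ill-conditioned higher-degree fits, `np.linalg.lstsq` would be a numerically safer way to obtain the estimate than the explicit inverse used here for clarity.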


3. Using the calibrated instrument 


After calibration, the instrument is routinely used in the laboratory. This
means that new responses $Y_i$ are observed, from which the corresponding
unknown stimuli $X_i$ are to be estimated with their uncertainties. The
model equation in this application is


$$\hat{X}_i = g^{-1}(\hat{\alpha}, Y_i), \qquad (15)$$


where $g^{-1}$ is the inverse function of g in Eq. (3) and the vector variable
$\alpha$ is now replaced by its estimate $\hat{\alpha}$. From a mathematical viewpoint,
the existence of the inverse function is subject to conditions. However,
in the range spanned by the calibration stimuli, the function g has to be
bijective for a meaningful calibration and therefore the inverse $g^{-1}$ exists.
The $\hat{X}_i$ values are here arranged in a vector $\hat{X}_{q \times 1}$ (q > 2), related to
the corresponding Y vector ($Y^T = [Y_1\ Y_2\ \cdots\ Y_q]$) through the estimated
parameter vector $\hat{\alpha}$. To further simplify the problem, let us arrange the
two vectors $\hat{\alpha}_{n \times 1}$ and $Y_{q \times 1}$ in a column (block) vector $W_{(n+q) \times 1}$, such
that $W^T = [\hat{\alpha}^T\ Y^T]$. In order to evaluate the covariance matrix of $\hat{X}$,
we use Eq. (9), here taking the form


$$V(\hat{X})_{q \times q} = J_W V(W) J_W^T. \qquad (16)$$


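It is worth noting that the propagation law (16) is exact for linear functions
only. For a generic function, such as that of Eq. (15), it represents a
first-order approximation. The covariance matrix $V(W)_{(n+q) \times (n+q)}$ of W
is an appropriate block matrix constructed with the covariance matrices
$V(\hat{\alpha})_{n \times n}$ and $V(Y)_{q \times q}$, namely


$$V(W) = \begin{bmatrix} V(\hat{\alpha}) & 0 \\ 0 & V(Y) \end{bmatrix}. \qquad (17)$$


Since the vectors $\hat{\alpha}$ and Y are uncorrelated, the off-diagonal blocks are
equal to zero. The lower block, V(Y), is diagonal; the upper, $V(\hat{\alpha})$, is
not. By further using block matrices, Eq. (16) may be rewritten as


$$V(\hat{X}) = \begin{bmatrix} J_{\hat{\alpha}} & J_Y \end{bmatrix} \begin{bmatrix} V(\hat{\alpha}) & 0 \\ 0 & V(Y) \end{bmatrix} \begin{bmatrix} J_{\hat{\alpha}}^T \\ J_Y^T \end{bmatrix} = J_{\hat{\alpha}} V(\hat{\alpha}) J_{\hat{\alpha}}^T + J_Y V(Y) J_Y^T. \qquad (18)$$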

Therefore, the covariance matrix of $\hat{X}$ is made of two terms, one related
to traceability and another accounting for the instrumental noise. The
generic matrix element $V_{ij}$ has the form


$$u(\hat{X}_i, \hat{X}_j) = \sum_{h=0}^{n-1} \sum_{k=0}^{n-1} \frac{\partial \hat{X}_i}{\partial \hat{\alpha}_h} \frac{\partial \hat{X}_j}{\partial \hat{\alpha}_k}\, u(\hat{\alpha}_h, \hat{\alpha}_k) + \delta_{ij} \left( \frac{\partial \hat{X}_i}{\partial Y_i} \right)^2 u^2(Y_i), \qquad (19)$$

where $\delta_{ij}$ (equal to 1 if i = j and to 0 if i ≠ j) is the Kronecker symbol. As expected, the
stimulus estimates are correlated by the common calibration coefficients. This
result is analogous to that obtained in [8, Sec. 6] concerning the correlation
introduced in mass comparisons by the common balance sensitivity.
Quite in agreement with common sense, the instrumental noise contributes
only to the estimate uncertainty and not to the covariance between any two
estimates.
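As concerns the derivatives in Eq. (19), we note that, according to the
theory of implicit functions,

$$\frac{\partial \hat{X}_i}{\partial \hat{a}_h} = -\left.\frac{\partial g/\partial a_h}{\partial g/\partial X}\right|_{\hat{X}_i = g^{-1}(\hat{\mathbf{a}}, Y_i)} \qquad (20)$$

and

$$\frac{\partial \hat{X}_i}{\partial Y_i} = \left.\left(\frac{\partial g}{\partial X}\right)^{-1}\right|_{\hat{X}_i = g^{-1}(\hat{\mathbf{a}}, Y_i)}. \qquad (21)$$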
In the case of a polynomial fit, one has in general

$$Y = \sum_{h=0}^{n-1} a_h X^h \qquad (22)$$

and

$$a_h = X^{-h}\, Y - \sum_{k=0}^{n-1} a_k X^{k-h}\,(1 - \delta_{hk}). \qquad (23)$$
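Using Eq. (21) with Eq. (22) we obtain

$$\frac{\partial \hat{X}_i}{\partial Y_i} = \left.\frac{1}{\sum_{k=1}^{n-1} k\,\hat{a}_k \hat{X}_i^{k-1}}\right|_{\hat{X}_i = g^{-1}(\hat{\mathbf{a}}, Y_i)}. \qquad (24)$$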
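The corresponding equation for the derivative with respect to $\hat{a}_h$ is

$$\frac{\partial \hat{X}_i}{\partial \hat{a}_h} = \left.\left[-h\,Y_i \hat{X}_i^{-h-1} - \sum_{k=0}^{n-1} (k-h)\,\hat{a}_k \hat{X}_i^{k-h-1}\right]^{-1}\right|_{\hat{X}_i = g^{-1}(\hat{\mathbf{a}}, Y_i)}, \qquad (25)$$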
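which, after substitution of Eq. (22), becomes

$$\frac{\partial \hat{X}_i}{\partial \hat{a}_h} = -\left.\frac{\hat{X}_i^h}{\sum_{k=1}^{n-1} k\,\hat{a}_k \hat{X}_i^{k-1}}\right|_{\hat{X}_i = g^{-1}(\hat{\mathbf{a}}, Y_i)}, \qquad (26)$$

that is

$$\frac{\partial \hat{X}_i}{\partial \hat{a}_h} = -\left.\hat{X}_i^h\,\frac{\partial \hat{X}_i}{\partial Y_i}\right|_{\hat{X}_i = g^{-1}(\hat{\mathbf{a}}, Y_i)}. \qquad (27)$$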





This result can also be obtained directly from the theory of implicit 
functions. Equations (27) and (24) yield the sensitivity coefficients of the
$\hat{X}_i$ estimate with respect to the $\hat{a}_h$ coefficients and to the $Y_i$ response,
respectively, and can easily be solved for any polynomial degree.
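The sensitivity coefficients can be sketched numerically as follows. This is only an illustration, assuming the monomial form $Y = \sum_h a_h X^h$, so that $\partial X/\partial Y$ is the reciprocal of the slope of the curve and $\partial X/\partial a_h = -X^h\,\partial X/\partial Y$. The function names, the bisection-based inversion and the numerical values of the coefficients are hypothetical choices made for the example and are not taken from the paper.

```python
def poly(a, x):
    """Calibration curve Y = sum_h a[h] * x**h (monomial form)."""
    return sum(a_h * x**h for h, a_h in enumerate(a))


def slope(a, x):
    """Slope of the curve, dY/dX = sum_k k * a[k] * x**(k-1)."""
    return sum(k * a[k] * x**(k - 1) for k in range(1, len(a)))


def invert(a, y, lo, hi, iterations=200):
    """Solve poly(a, x) = y by bisection; assumes the curve is monotonic
    (hence invertible) on [lo, hi] and that the root lies in that range."""
    f_lo = poly(a, lo) - y
    for _ in range(iterations):
        mid = 0.5 * (lo + hi)
        f_mid = poly(a, mid) - y
        if (f_mid > 0.0) == (f_lo > 0.0):
            lo, f_lo = mid, f_mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def sensitivities(a, x):
    """Return dX/dY = 1/slope and the list dX/da_h = -x**h * dX/dY."""
    dx_dy = 1.0 / slope(a, x)
    dx_da = [-(x**h) * dx_dy for h in range(len(a))]
    return dx_dy, dx_da


# Hypothetical second-degree calibration curve and observed response.
a_hat = [0.1, 2.0, 0.05]
y_obs = 5.0
x_hat = invert(a_hat, y_obs, 0.0, 10.0)
dx_dy, dx_da = sensitivities(a_hat, x_hat)
print(x_hat, dx_dy, dx_da)
```

Only the inversion step requires an iterative solver; once the estimate is available, the coefficients follow in closed form for any polynomial degree.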
As already mentioned, an instrument is preferably used by taking the
difference $\hat{X}_i - \hat{X}_j$ between two stimuli corresponding to responses $Y_i$ and
$Y_j$. This difference may vary between near-zero values, when the instrument
is a comparator, and full-scale values, when the value $\hat{X}_j$ is the zero reading
of the instrument. In order to evaluate the difference uncertainty, one can
use Eq. (19). However, we follow a slightly different derivation. In matrix
terms, the generic difference $\hat{X}_i - \hat{X}_j$ of any two estimates is modelled by
$\hat{X}_i - \hat{X}_j = \mathbf{d}^T \hat{\mathbf{X}}$, where $\mathbf{d}^T_{1\times q} = [0\ \cdots\ 1_i\ \cdots\ -1_j\ \cdots\ 0]$. The resulting
difference covariance is therefore


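$$V(\mathbf{d}^T \hat{\mathbf{X}}) = \mathbf{d}^T V(\hat{\mathbf{X}})\, \mathbf{d}, \qquad (28)$$

or

$$V(\mathbf{d}^T \hat{\mathbf{X}}) = \mathbf{d}^T \mathbf{J}_a V(\hat{\mathbf{a}}) \mathbf{J}_a^T \mathbf{d} + \mathbf{d}^T \mathbf{J}_Y V(\mathbf{Y}) \mathbf{J}_Y^T \mathbf{d}. \qquad (29)$$

By separately developing the two terms of the r.h.s. of Eq. (29), we obtain

$$\mathbf{d}^T \mathbf{J}_a V(\hat{\mathbf{a}}) \mathbf{J}_a^T \mathbf{d} = (\mathbf{J}_{i,a} - \mathbf{J}_{j,a})\, V(\hat{\mathbf{a}})\, (\mathbf{J}_{i,a} - \mathbf{J}_{j,a})^T, \qquad (30)$$

where the row vector $\mathbf{J}_{i,a\,(1\times n)} = \partial \hat{X}_i/\partial \hat{\mathbf{a}}^T$ is the $i$-th row of $\mathbf{J}_a$. Equation (30)
yields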

















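$$\mathbf{d}^T \mathbf{J}_a V(\hat{\mathbf{a}}) \mathbf{J}_a^T \mathbf{d} = \left[\frac{\partial \hat{X}_i}{\partial \hat{\mathbf{a}}^T} V(\hat{\mathbf{a}}) \left(\frac{\partial \hat{X}_i}{\partial \hat{\mathbf{a}}^T}\right)^T + \frac{\partial \hat{X}_j}{\partial \hat{\mathbf{a}}^T} V(\hat{\mathbf{a}}) \left(\frac{\partial \hat{X}_j}{\partial \hat{\mathbf{a}}^T}\right)^T - 2\,\frac{\partial \hat{X}_i}{\partial \hat{\mathbf{a}}^T} V(\hat{\mathbf{a}}) \left(\frac{\partial \hat{X}_j}{\partial \hat{\mathbf{a}}^T}\right)^T\right]_{\hat{X}_{i,j} = g^{-1}(\hat{\mathbf{a}}, Y_{i,j})}. \qquad (31)$$

This expression is written in the more familiar form

$$\mathbf{d}^T \mathbf{J}_a V(\hat{\mathbf{a}}) \mathbf{J}_a^T \mathbf{d} = u^2(\hat{X}_i) + u^2(\hat{X}_j) - 2u(\hat{X}_i, \hat{X}_j). \qquad (32)$$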
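Rearranging the terms of Eq. (31) yields

$$\mathbf{d}^T \mathbf{J}_a V(\hat{\mathbf{a}}) \mathbf{J}_a^T \mathbf{d} = \left.\sum_{h=0}^{n-1}\sum_{k=0}^{n-1} \left(\frac{\partial \hat{X}_i}{\partial \hat{a}_h} - \frac{\partial \hat{X}_j}{\partial \hat{a}_h}\right)\left(\frac{\partial \hat{X}_i}{\partial \hat{a}_k} - \frac{\partial \hat{X}_j}{\partial \hat{a}_k}\right) u(\hat{a}_h, \hat{a}_k)\right|_{\hat{X}_{i,j} = g^{-1}(\hat{\mathbf{a}}, Y_{i,j})}, \qquad (33)$$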
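or

$$\mathbf{d}^T \mathbf{J}_a V(\hat{\mathbf{a}}) \mathbf{J}_a^T \mathbf{d} = \left.\sum_{h=0}^{n-1} \left(\frac{\partial \hat{X}_i}{\partial \hat{a}_h} - \frac{\partial \hat{X}_j}{\partial \hat{a}_h}\right)^2 u^2(\hat{a}_h)\right|_{\hat{X}_{i,j} = g^{-1}(\hat{\mathbf{a}}, Y_{i,j})} + \left. 2\sum_{h=0}^{n-2}\sum_{k=h+1}^{n-1} \left(\frac{\partial \hat{X}_i}{\partial \hat{a}_h} - \frac{\partial \hat{X}_j}{\partial \hat{a}_h}\right)\left(\frac{\partial \hat{X}_i}{\partial \hat{a}_k} - \frac{\partial \hat{X}_j}{\partial \hat{a}_k}\right) u(\hat{a}_h, \hat{a}_k)\right|_{\hat{X}_{i,j} = g^{-1}(\hat{\mathbf{a}}, Y_{i,j})}. \qquad (34)$$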


An analogous development for the second term in the r.h.s of Eq. (29) 
yields 


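$$\mathbf{d}^T \mathbf{J}_Y V(\mathbf{Y}) \mathbf{J}_Y^T \mathbf{d} = \left.\sum_{r=1}^{q} \left(\frac{\partial \hat{X}_i}{\partial Y_r} - \frac{\partial \hat{X}_j}{\partial Y_r}\right)^2 u^2(Y_r)\right|_{\hat{X}_{i,j} = g^{-1}(\hat{\mathbf{a}}, Y_{i,j})} \qquad (35)$$

(the $\mathbf{Y}$ covariance $V(\mathbf{Y})$ being diagonal, the covariance terms vanish).
The derivatives also vanish for any $r \neq i, j$, so that the only two remaining
terms give

$$\mathbf{d}^T \mathbf{J}_Y V(\mathbf{Y}) \mathbf{J}_Y^T \mathbf{d} = \left[\left(\frac{\partial \hat{X}_i}{\partial Y_i}\right)^2 u^2(Y_i) + \left(\frac{\partial \hat{X}_j}{\partial Y_j}\right)^2 u^2(Y_j)\right]_{\hat{X}_{i,j} = g^{-1}(\hat{\mathbf{a}}, Y_{i,j})}. \qquad (36)$$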


It is worth noting that expressions (33) and (36) are quite general, and
depend neither on the specific calibration function (3) chosen, nor
on the assumption on the uncertainty of the $X_i$ stimuli used for calibration.
Only Eq. (36) is based on assumption (7), but can easily be generalised,
thus taking a form similar to that of Eq. (33).


4. The linear case 


In the linear case one has

$$\hat{X}_i = \frac{Y_i - \hat{a}_0}{\hat{a}_1}. \qquad (37)$$


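In this case the derivatives of Eqs. (34) and (36) are easily calculated, and
Eq. (29) becomes

$$u^2(\hat{X}_i - \hat{X}_j) = \frac{u^2(Y_i) + u^2(Y_j)}{\hat{a}_1^2} + (\hat{X}_i - \hat{X}_j)^2\, \frac{u^2(\hat{a}_1)}{\hat{a}_1^2}. \qquad (38)$$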
The first contribution comes from the instrumental noise. The second
comes from calibration, and thus accounts for traceability uncertainty. Its
value is proportional to $(\hat{X}_i - \hat{X}_j)^2$, as conjectured. It is to be noted that
the intercept uncertainty plays no role.

The relative uncertainty $w(\hat{X}_i - \hat{X}_j)$ takes the form

$$w(\hat{X}_i - \hat{X}_j) = \sqrt{w^2(Y_i - Y_j) + w^2(\hat{a}_1)}. \qquad (39)$$


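In the case of linear fit, the covariance $V(\hat{\mathbf{X}})$ can also be obtained
directly from Eq. (18). By developing calculations, one has

$$V(\hat{\mathbf{X}}) = \frac{1}{\hat{a}_1^2}\left\{ [\mathbf{e}\ \ \hat{\mathbf{X}}]\, V(\hat{\mathbf{a}})\, [\mathbf{e}\ \ \hat{\mathbf{X}}]^T + V(\mathbf{Y}) \right\}, \qquad (40)$$

where $\mathbf{e}^T = [1\ \cdots\ 1]$. Explicit expressions for the covariance matrix elements are:

$$u^2(\hat{X}_i) = \frac{u^2(Y_i) + u^2(\hat{a}_0) + \hat{X}_i^2 u^2(\hat{a}_1) + 2\hat{X}_i\, u(\hat{a}_0, \hat{a}_1)}{\hat{a}_1^2} \qquad (41)$$

for variance and

$$u(\hat{X}_i, \hat{X}_j) = \frac{u^2(\hat{a}_0) + \hat{X}_i \hat{X}_j\, u^2(\hat{a}_1) + (\hat{X}_i + \hat{X}_j)\, u(\hat{a}_0, \hat{a}_1)}{\hat{a}_1^2} \qquad (42)$$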
for covariance, from which Eq. (38) follows. Again, note that in the co- 
variance term of Eq. (42), the only contribution comes from the coefficient 
uncertainty, as predicted in Eq. (19). 
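As a numerical cross-check of the linear case, the following sketch combines the variance and covariance expressions of Eqs. (41) and (42) into the variance of a difference and compares it with the closed form consisting of a noise term plus a calibration term proportional to $(\hat{X}_i - \hat{X}_j)^2$. It assumes a common response variance $u^2(Y)$ for both readings; the function names and all numerical values are hypothetical and serve only to illustrate the algebra.

```python
def var_estimate(x, a1, u2_y, u2_a0, u2_a1, u_a0a1):
    """Variance of one stimulus estimate in the linear case (Eq. (41))."""
    return (u2_y + u2_a0 + x * x * u2_a1 + 2.0 * x * u_a0a1) / a1**2


def cov_estimates(xi, xj, a1, u2_a0, u2_a1, u_a0a1):
    """Covariance of two stimulus estimates in the linear case (Eq. (42))."""
    return (u2_a0 + xi * xj * u2_a1 + (xi + xj) * u_a0a1) / a1**2


def var_difference(xi, xj, a1, u2_y, u2_a0, u2_a1, u_a0a1):
    """u^2(Xi - Xj) = u^2(Xi) + u^2(Xj) - 2 u(Xi, Xj)."""
    return (var_estimate(xi, a1, u2_y, u2_a0, u2_a1, u_a0a1)
            + var_estimate(xj, a1, u2_y, u2_a0, u2_a1, u_a0a1)
            - 2.0 * cov_estimates(xi, xj, a1, u2_a0, u2_a1, u_a0a1))


def var_difference_closed_form(xi, xj, a1, u2_y, u2_a1):
    """Noise term plus calibration term proportional to (Xi - Xj)^2."""
    return 2.0 * u2_y / a1**2 + (xi - xj) ** 2 * u2_a1 / a1**2


# Made-up slope, variances and covariance, for illustration only.
pars = dict(a1=2.0, u2_y=1e-4, u2_a0=4e-4, u2_a1=9e-6, u_a0a1=-3e-5)
full = var_difference(10.0, 0.5, **pars)
closed = var_difference_closed_form(10.0, 0.5, pars["a1"], pars["u2_y"], pars["u2_a1"])
print(full, closed)
```

The intercept variance and the intercept-slope covariance cancel in the difference, which is why they do not appear in the closed form.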


5. Higher-order fits 


The expression for the difference uncertainty becomes more and more com- 
plicated as the degree of the fit increases. We report here the case of a 
second-degree polynomial 


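$$\begin{aligned}
u^2(\hat{X}_i - \hat{X}_j) = {} & (\hat{X}_i - \hat{X}_j)^2 (B_i B_j)^{-2} \Big\{ 4\hat{a}_2^2\, u^2(\hat{a}_0) + \hat{a}_1^2\, u^2(\hat{a}_1) + \big[\hat{a}_1(\hat{X}_i + \hat{X}_j) + 2\hat{a}_2 \hat{X}_i \hat{X}_j\big]^2 u^2(\hat{a}_2) \\
& - 4\hat{a}_1 \hat{a}_2\, u(\hat{a}_0, \hat{a}_1) - 4\hat{a}_2 \big[\hat{a}_1(\hat{X}_i + \hat{X}_j) + 2\hat{a}_2 \hat{X}_i \hat{X}_j\big]\, u(\hat{a}_0, \hat{a}_2) \\
& + 2\hat{a}_1 \big[\hat{a}_1(\hat{X}_i + \hat{X}_j) + 2\hat{a}_2 \hat{X}_i \hat{X}_j\big]\, u(\hat{a}_1, \hat{a}_2) \Big\} \\
& + B_i^{-2}\, u^2(Y_i) + B_j^{-2}\, u^2(Y_j), \qquad (43)
\end{aligned}$$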
where $B_{i,j} = \hat{a}_1 + 2\hat{a}_2 \hat{X}_{i,j}$. Also in this case the uncertainty contributions
due to calibration depend on the term $\hat{X}_i - \hat{X}_j$. The monomial representation
(4) chosen for the polynomial expansion is known to lead to severe
numerical difficulties [3], especially when the polynomial degree is higher
than, say, 3 or 4. Yet, this representation is still widely adopted, an outstanding
example being the reference function of the ITS-90 [9]. Orthogonal
polynomials have been proposed as an advantageous alternative [10, 11] and
we think that their properties as concerns the in-use uncertainty deserve
further investigation.


6. Comments 


The derivatives in Eqs. (33) and (36) are the slopes of the curve (15), evaluated
at the appropriate points. For $\hat{X}_i \to \hat{X}_j$ the calibration contribution
given by Eq. (33) tends to zero and only the comparator noise term (36),
tending to oe u(Y;) remains. This behaviour holds independent of the 
curve function. In particular, in the case of fitting by a straight line, the
calibration contribution is strictly proportional to $(\hat{X}_i - \hat{X}_j)^2$.


These results are quite in agreement with common sense. First, the 
different weight of the calibration contribution according to the different use 
of the instrument — as a comparator, thus resorting to an external standard, 
in case (1), or as a traceable standard, thus bearing all the uncertainty 
contribution arising from the calibration, in case (2) — is in agreement with 
everyday experience. Second, the noise contribution from the instrument 
has to be taken twice into account, in both cases. This result, apparent in 
the comparator mode, since two readings are needed, has sometimes been 
questioned for the direct reading mode, in which only one reading seems 
to be necessary. Actually, the preliminary “zeroing” of the instrument 
represents in any case a second reading, whose contribution to uncertainty 
cannot be neglected. 
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This paper describes automatic differentiation techniques and their use in metrol- 
ogy. Many models in metrology are nonlinear and the analysis of data using these 
models requires the calculation of the derivatives of the functions involved. While 
the rules for differentiation are straightforward to understand, their implementa- 
tion by hand is often time consuming and error prone. The complexity of some 
problems makes the use of hand coded derivatives completely unrealistic. Au- 
tomatic differentiation (AD) is a term used to describe numerical techniques for 
computing the derivatives of a function of one or more variables. AD
techniques potentially offer a much more efficient and accurate way of obtaining
derivatives in an automated manner, regardless of the complexity of the problem. In
this paper, we describe a number of these techniques and discuss their advantages 
and disadvantages. 


1. Introduction 


There is a wide variety of mathematical and scientific problems in which 
it is necessary to obtain accurate derivative evaluations. We recall that if 
f(x) is a function of a variable x, the derivative 


$$\frac{df}{dx}(x) = f'(x)$$

is also a function of $x$ and represents the slope of (the tangent to) $f$ at $x$.
Mathematically, the derivative is defined as a limit

$$\frac{df}{dx} = \lim_{h \to 0} \frac{f(x+h) - f(x)}{h}. \qquad (1)$$

The fraction on the right represents the slope of the line passing through
$(x, f(x))$ and a nearby point $(x+h, f(x+h))$.


* Work partially funded under EU SofTools-MetroNet Contract N. G6RT-CT-2001-05061 
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Using differentiation rules, the derivatives of complex functions can be 
expressed in terms of the derivatives of their simpler, component functions. 
However, even for only moderately complicated functions, the hand calcu- 
lation of derivatives can lead to pages of tedious algebra with an increasing 
likelihood of a mistake entering into the computation. 

If the function evaluation is encoded in a software component, it is nat- 
ural to ask if it is possible to compute the derivatives automatically using 
the function evaluation component. Until recently, the standard method 
of evaluating derivatives numerically was to use finite differences, essen- 
tially evaluating the right-hand side of (1) with h a small, pre-assigned 
nonzero number. This approach generally gives an approximate value. In 
recent years, advances in computers and computer languages have allowed 
the development of a new method for obtaining accurate derivatives of any 
programmable function. The term automatic differentiation’® (AD) gen- 
erally applies to techniques that produce, from a function evaluation soft- 
ware component, a computational scheme, also implemented in software, 
that calculates the derivatives. These techniques have evolved and are still 
evolving both in terms of their theoretical basis and, more markedly, in the 
software engineering aspects of their implementation. In a separate devel- 
opment, attention is now being paid to the complex step method which uses 
complex arithmetic to evaluate accurate derivatives. 

In this paper we describe the main techniques for function differenti- 
ation, giving a summary of their advantages and disadvantages in terms 
of both accuracy and ease of implementation. The remainder of this pa- 
per is organized as follows. In section 2, we give a number of metrological 
application areas in which the calculation of derivatives is required. In sec- 
tions 3, 4 and 5 we give descriptions of the main techniques for calculating 
derivatives and summarize their advantages and disadvantages. Example 
calculations are given in section 6. Our concluding remarks are presented 
in section 7. 


2. Nonlinear problems in data analysis in metrology 


In this section, we consider some of the main examples where derivative 
evaluation is required in metrology. 


2.1. Nonlinear least squares approximation 


In many calibration problems, the observation equation involving measurement
$y_i$ can be expressed as $y_i = \phi_i(\mathbf{a}) + \epsilon_i$, where $\phi_i$ is a function depending
on parameters $\mathbf{a} = (a_1, \ldots, a_n)^T$ that specifies the behaviour of the instrument
and $\epsilon_i$ represents the mismatch between the model specified by $\mathbf{a}$ and
the measurement $y_i$. Given a set of measurement data $\{y_i\}_1^m$, estimates of
the calibration parameters $\mathbf{a}$ can be determined by solving

$$\min_{\mathbf{a}} F(\mathbf{a}) = f_1^2(\mathbf{a}) + f_2^2(\mathbf{a}) + \cdots + f_m^2(\mathbf{a}) = \sum_{i=1}^{m} f_i^2(\mathbf{a}), \qquad (2)$$

where $f_i(\mathbf{a}) = y_i - \phi_i(\mathbf{a})$. Least squares techniques are appropriate for
determining parameter estimates for a broad range of calibration problems’. 

A common approach to solving least squares problems (2) is the Gauss-Newton
algorithm. If $\mathbf{a}$ is an estimate of the solution and $J$ is the Jacobian
matrix defined at $\mathbf{a}$ by $J_{ij} = \partial f_i/\partial a_j$, then an updated estimate of the
solution is $\mathbf{a} + \alpha\mathbf{p}$, where $\mathbf{p}$ solves

$$\min_{\mathbf{p}} \|J\mathbf{p} + \mathbf{f}\|^2,$$

and $\alpha$ is a step length chosen to ensure that adequate progress to a solution
is made. Starting with an appropriate initial estimate of a, these steps are 
repeated until convergence criteria are met. The solution of a nonlinear 
least squares problem therefore requires the calculation of the derivative 
matrix J. 
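A minimal sketch of the Gauss-Newton iteration is given below, assuming a hypothetical two-parameter model $\phi_i(\mathbf{a}) = a_0 \exp(a_1 x_i)$ with synthetic, noise-free data. The step $\mathbf{p}$ is obtained from the normal equations $(J^T J)\mathbf{p} = -J^T\mathbf{f}$ and the full step ($\alpha = 1$) is always taken; both are simplifications chosen for brevity, and a robust implementation would use an orthogonal factorization and a line search. The Jacobian is coded by hand, which is exactly the step that the differentiation techniques discussed in this paper aim to automate.

```python
import math


def residuals(a, xs, ys):
    """f_i(a) = y_i - phi_i(a) for the assumed model phi_i = a[0]*exp(a[1]*x_i)."""
    return [y - a[0] * math.exp(a[1] * x) for x, y in zip(xs, ys)]


def jacobian(a, xs):
    """Hand-coded J[i][j] = d f_i / d a_j for the assumed model."""
    return [[-math.exp(a[1] * x), -a[0] * x * math.exp(a[1] * x)] for x in xs]


def gauss_newton(a, xs, ys, iterations=50, tol=1e-12):
    a = list(a)
    for _ in range(iterations):
        f = residuals(a, xs, ys)
        J = jacobian(a, xs)
        # Normal equations (J^T J) p = -J^T f for the 2 x 2 case.
        A00 = sum(r[0] * r[0] for r in J)
        A01 = sum(r[0] * r[1] for r in J)
        A11 = sum(r[1] * r[1] for r in J)
        b0 = -sum(r[0] * fi for r, fi in zip(J, f))
        b1 = -sum(r[1] * fi for r, fi in zip(J, f))
        det = A00 * A11 - A01 * A01
        p = [(b0 * A11 - b1 * A01) / det, (A00 * b1 - A01 * b0) / det]
        a = [a[0] + p[0], a[1] + p[1]]  # full step, alpha = 1
        if abs(p[0]) + abs(p[1]) < tol:
            break
    return a


# Synthetic data generated from a0 = 2, a1 = -0.7 (illustration only).
xs = [0.0, 0.5, 1.0, 1.5, 2.0]
ys = [2.0 * math.exp(-0.7 * x) for x in xs]
a_fit = gauss_newton([1.5, -0.5], xs, ys)
print(a_fit)
```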


2.2. Uncertainty evaluation 


If $U_a$ is the uncertainty (covariance) matrix associated with a set of parameters
$\mathbf{a}$ and $f(\mathbf{a})$ is a function of $\mathbf{a}$ then the standard uncertainty $u(f)$
associated with $f$ is estimated from

$$u^2(f) = \mathbf{g}^T U_a\, \mathbf{g}, \qquad (3)$$

where $\mathbf{g}$ is the gradient vector of $f$, i.e., $\mathbf{g} = (\partial f/\partial a_1, \ldots, \partial f/\partial a_n)^T$. These
partial derivatives are often referred to as sensitivity coefficients®?. If the 
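uncertainty matrix is diagonal with $U_{ii} = u_i^2$, $U_{ij} = 0$, $i \neq j$, then (3)
simplifies to the perhaps more familiar

$$u^2(f) = \left(\frac{\partial f}{\partial a_1}\right)^2 u_1^2 + \cdots + \left(\frac{\partial f}{\partial a_n}\right)^2 u_n^2 = \sum_{i=1}^{n} \left(\frac{\partial f}{\partial a_i}\right)^2 u_i^2. \qquad (4)$$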
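The propagation formula can be illustrated by the short sketch below, which evaluates the quadratic form $\mathbf{g}^T U_a \mathbf{g}$ directly. The function $f(\mathbf{a}) = a_1 a_2$, the evaluation point and the entries of the uncertainty matrix are hypothetical values chosen so that the arithmetic is easy to follow.

```python
def propagate(g, U):
    """Return u^2(f) = g^T U g for gradient g and uncertainty matrix U."""
    n = len(g)
    return sum(g[i] * U[i][j] * g[j] for i in range(n) for j in range(n))


# Hypothetical example: f(a) = a1 * a2 evaluated at a = (2, 3),
# so that the gradient is g = (a2, a1) = (3, 2).
g = [3.0, 2.0]
U_full = [[0.01, 0.002],
          [0.002, 0.04]]
U_diag = [[0.01, 0.0],
          [0.0, 0.04]]

u2_full = propagate(g, U_full)  # includes the covariance contribution
u2_diag = propagate(g, U_diag)  # reduces to the sum of squared terms
print(u2_full, u2_diag)
```

With the covariance term included, the result differs from the diagonal case by $2 g_1 g_2 U_{12}$, which shows why the correlation between parameters cannot in general be ignored.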





2.3. Maximum likelihood estimation and numerical 
optimization 


Least squares approximation methods are a form of maximum likelihood 
estimation in which the most likely explanation of the data is sought. For 
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pre-assigned statistical models in which uncertainty in the measurement 
data is modelled by Gaussian distributions with a known uncertainty (co- 
variance) matrix, the maximum likelihood corresponds to minimizing a sum 
of squares’. For more general models in which the uncertainty matrix is 
not known exactly, the maximum likelihood is attained by minimizing a 
function of the form 


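$$F(\mathbf{a}, \boldsymbol{\sigma}) = h(\boldsymbol{\sigma}) + \sum_{i=1}^{m} f_i^2(\mathbf{a}, \boldsymbol{\sigma}),$$

where $\boldsymbol{\sigma}$ are additional optimization parameters that specify the unknown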
input uncertainty matrix®. In this more general setting, F is no longer a 
sum of squares and the Gauss-Newton algorithm can no longer be applied 
(at least with the same success). 

The Gauss-Newton algorithm is in fact derived from the more general
Newton’s algorithm for minimizing a function $F(\mathbf{a})$. In the Newton
algorithm, the search direction $\mathbf{p}_N$ is computed as the solution of
$H\mathbf{p}_N = -\mathbf{g}$, where $\mathbf{g} = \nabla F$ is the gradient vector of first partial derivatives,
$g_j = \partial F/\partial a_j$, and $H = \nabla^2 F$ is the Hessian matrix of second partial
derivatives $H_{jk} = \partial^2 F/\partial a_j \partial a_k$.


3. Finite differences 
3.1. Description 


The main idea of the method is simple: approximate the derivative (1)
by evaluating $f$ at two nearby points. Typical of finite difference formulae
are the forward difference formula

$$f'(x) \approx f_x = \frac{f(x+h) - f(x)}{h}$$

and the central difference formula

$$f'(x) \approx f_x = \frac{f(x+h) - f(x-h)}{2h}.$$

Here $h$ is a “step” selected in an appropriate way.
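The two formulae can be sketched as follows. The test function $\sin x$, the evaluation point and the range of step sizes are arbitrary choices made for this illustration; the loop simply prints the absolute errors so that the competing effects of truncation (large $h$) and cancellation (small $h$) can be observed.

```python
import math


def forward_difference(f, x, h):
    """Forward difference estimate (f(x+h) - f(x)) / h."""
    return (f(x + h) - f(x)) / h


def central_difference(f, x, h):
    """Central difference estimate (f(x+h) - f(x-h)) / (2h)."""
    return (f(x + h) - f(x - h)) / (2.0 * h)


x0 = 1.0
exact = math.cos(x0)  # derivative of sin at x0
for exponent in range(2, 13, 2):
    h = 10.0 ** (-exponent)
    err_fwd = abs(forward_difference(math.sin, x0, h) - exact)
    err_cen = abs(central_difference(math.sin, x0, h) - exact)
    print(h, err_fwd, err_cen)
```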




3.2. Advantages 


The most important advantage of the finite difference approach is that 
it does not require access to the source code for the function evaluation 
component and therefore, from this point of view, it is an ideal method for 
library software. 
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3.3. Disadvantages 


The method provides only approximate estimates. In ideal circumstances 
using central differences we still expect to lose one third of the available 
figures. For poorly scaled or ill-conditioned problems, the accuracy can be 
much worse. For metrology applications with data accurate to 1 part in 
10°, there is a significant risk that finite differences implemented in IEEE 
arithmetic could introduce errors comparable to or larger than the measure- 
ment uncertainty. There are technical difficulties in balancing truncation 
errors with cancellation errors in the choice of step size h. The method is 
inefficient since for a problem of n parameters, it requires between n + 1 
and 2n + 1 function evaluations to calculate all n partial derivatives. 


4. The complex step method 
4.1. Description 


The complex step method is similar to finite differences but uses complex 
arithmetic to provide accurate estimates. Current interest!*14 in the ap- 
proach was generated by Squire and Trapp!® who revisited earlier work by 
Lyness and Moler!?. We recall that a complex number z is of the form 
$z = x + iy$ where $x$ and $y$ are real and $i = \sqrt{-1}$. All the arithmetic operations
for real numbers also apply for complex numbers. Most real-valued
functions $f(x)$ occurring in science also have complex-valued counterparts,
and can be written in the form


$$f(z) = \Re f(z) + i\,\Im f(z),$$

where the real-valued functions $\Re f$ and $\Im f$ are known as the real and
imaginary parts. The concept of derivative is defined by the complex version
of (1) and from this we can also derive the Taylor expansion for a complex
function. For $x$ real and $h$ real and small, we have

$$f(x + ih) = f(x) + ih f'(x) - \frac{h^2}{2!} f''(x) - i\frac{h^3}{3!} f'''(x) + \frac{h^4}{4!} f''''(x) + \cdots.$$

Taking the imaginary part, we have

$$\Im f(x + ih) = h f'(x) - \frac{h^3}{3!} f'''(x) + \frac{h^5}{5!} f^{(5)}(x) - \cdots,$$

and we see that

$$f'(x) \approx f_x = \frac{\Im f(x + ih)}{h}, \qquad (5)$$
with a truncation error of order $h^2$. Unlike the use of a finite difference
formula, $h$ can be chosen to be very small (e.g. $10^{-100}$) with no concern
about the loss of significant figures through subtractive cancellation since
no subtraction is involved.

The following Matlab instructions compare the CS estimate of the
derivative of $\sin x$ with $\cos x$. The difference is calculated to be zero.


>> i=sqrt(-1); h = 1e-100; x=1; d = imag(sin(x+i*h)/h); d - cos(x)


We have implemented°, both in Matlab and Fortran 90, software components
that use the CS method to evaluate the derivatives of $f = f(\mathbf{x})$,
$\mathbf{x} = (x_1, \ldots, x_n)^T$, where $f$ is calculated by a user-supplied component
with $\mathbf{x}$ declared as an $n \times 1$ complex array.
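For readers without access to Matlab or Fortran 90, the complex step idea can be sketched in Python using the standard `cmath` module. This is an illustration only and not the software components referred to above; the function names and the two-variable test function are hypothetical. The gradient routine perturbs one variable at a time along the imaginary axis, so that $n$ function evaluations yield all $n$ partial derivatives.

```python
import cmath
import math


def complex_step_derivative(f, x, h=1e-100):
    """Estimate f'(x) as Im f(x + ih) / h for a scalar function f."""
    return f(complex(x, h)).imag / h


def complex_step_gradient(f, x, h=1e-100):
    """Estimate the gradient of f at the real point x (a list of floats);
    f must accept a list of complex numbers."""
    grad = []
    for j in range(len(x)):
        z = [complex(v, 0.0) for v in x]
        z[j] = complex(x[j], h)  # perturb only the j-th variable
        grad.append(f(z).imag / h)
    return grad


# Scalar check: derivative of sin at x = 1 compared with cos(1).
d = complex_step_derivative(cmath.sin, 1.0)
print(d - math.cos(1.0))


# Hypothetical two-variable function f(x) = x1 * exp(x2).
def f2(z):
    return z[0] * cmath.exp(z[1])


print(complex_step_gradient(f2, [2.0, 0.5]))
```

The user-supplied function must be written with complex-capable operations (here `cmath.exp` rather than `math.exp`), which mirrors the requirement that the relevant variables be declared complex.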


4.2. Advantages 


The complex step method can provide derivative information accurate to 
full machine precision. The method is easy to implement. In a language 
that supports complex arithmetic, the user can write a function evaluation 
component as normal, only declaring the relevant variables as complex, 
but care must be exercised with some intrinsic functions. There is no 
requirement to have access to source code, so long as the correct variable 
types have been used. 


4.3. Disadvantages 


As with standard finite differences, for problems involving a large number 
of variables, the method is inefficient. If there are n variables, then n func- 
tion evaluations are required to calculate all n partial derivatives. Existing 
software written in real arithmetic has to be re-engineered to work with 
complex arithmetic. Subtractive cancellation can occur in special circum- 
stances. Only first derivatives can be calculated. 


5. Program differentiation techniques 


Forward and reverse automatic differentiation (FAD, RAD) are two ap- 
proaches aimed at “differentiating the program” that computes a function 
value’. They are accurate in the sense that they apply the rules of calcu- 
lus in a repetitive way to an algorithmic specification of a function and, 
in exact arithmetic, will produce the exact answer. They rely on the fact 
that any program, no matter how complex it is, can be broken down into a 
finite combination of elementary operators!*!! such as arithmetic operations
(e.g., $+$, $-$) or elementary functions (e.g., $\sin x$, $\cos x$, $e^x$). Thus, the
evaluation of a function $y = y(x_1, \ldots, x_n)$ can be written as a sequence of
$m$, say, operations

1 Initialise $t_j = x_j$, $j = 1, \ldots, n$.
2 $t_q = \Psi_q(t_k, t_l)$, $q = n+1, \ldots, m$.
3 Set $y = t_m$.

Here $\Psi_q$ is a unary or binary operation or elementary function. In FAD,
the derivatives $\partial t_q/\partial x_j$ are calculated as the function evaluation progresses
according to the chain rule:

$$\frac{\partial t_q}{\partial x_j} = \frac{\partial \Psi_q}{\partial t_k}\frac{\partial t_k}{\partial x_j} + \frac{\partial \Psi_q}{\partial t_l}\frac{\partial t_l}{\partial x_j}.$$

In a RAD approach, the function is evaluated in a forward sweep (as in
FAD) and then in a reverse sweep the partial derivatives $\bar{t}_q = \partial y/\partial t_q$ of the
output $y$ with respect to the intermediate quantities $t_q$ are assembled in
reverse order, finishing with the required partial derivatives $\partial y/\partial x_j$. The
reverse sweep is summarized as:

1 Initialize $\bar{t}_i = 0$, $i = 1, \ldots, m-1$, $\bar{t}_m = 1$.
2 For $q = m, m-1, m-2, \ldots, n+1$, if $t_q = \Psi_q(t_k, t_l)$,

$$\bar{t}_k \leftarrow \bar{t}_k + \frac{\partial \Psi_q}{\partial t_k}\,\bar{t}_q, \qquad \bar{t}_l \leftarrow \bar{t}_l + \frac{\partial \Psi_q}{\partial t_l}\,\bar{t}_q.$$

3 For $j = 1, \ldots, n$, $\partial y/\partial x_j = \bar{t}_j$.
The cost of computing the gradient of a function of n variables is between 
2 and 6 times the cost of computing the function value itself for the RAD 
while for the FAD it is between n and 4n. 
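Forward automatic differentiation by operator overloading can be sketched as below. The `Dual` class, which carries a value together with its derivative with respect to one chosen input, and the helper functions `sin` and `cos` are hypothetical names introduced for this illustration; only the operations needed by the example are overloaded. Each operation applies the chain rule to the derivative part as the function value is computed.

```python
import math


class Dual:
    """A value together with its derivative with respect to one input."""

    def __init__(self, val, der=0.0):
        self.val, self.der = val, der

    @staticmethod
    def _lift(other):
        return other if isinstance(other, Dual) else Dual(float(other))

    def __add__(self, other):
        other = Dual._lift(other)
        return Dual(self.val + other.val, self.der + other.der)

    __radd__ = __add__

    def __sub__(self, other):
        other = Dual._lift(other)
        return Dual(self.val - other.val, self.der - other.der)

    def __rsub__(self, other):
        return Dual._lift(other) - self

    def __mul__(self, other):
        other = Dual._lift(other)
        return Dual(self.val * other.val,
                    self.der * other.val + self.val * other.der)

    __rmul__ = __mul__

    def __truediv__(self, other):
        other = Dual._lift(other)
        return Dual(self.val / other.val,
                    (self.der * other.val - self.val * other.der) / other.val**2)

    def __rtruediv__(self, other):
        return Dual._lift(other) / self


def sin(t):
    return Dual(math.sin(t.val), math.cos(t.val) * t.der)


def cos(t):
    return Dual(math.cos(t.val), -math.sin(t.val) * t.der)


def f(x):
    """The function code is written as usual; only the variable type changes."""
    return (1 / x) * cos(1 / x)


x0 = 0.5
y = f(Dual(x0, 1.0))  # seed dx/dx = 1
print(y.val, y.der)
```

For a function of $n$ variables this scalar version has to be run once per input with a different seed, which reflects the cost estimate for FAD given above.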

Many of the derivative calculations involved in maximizing likelihoods 
reported in Cox et al.8 were performed using RAD. 
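The forward and reverse sweeps of RAD can be sketched with a simple tape, as below. This is an illustrative toy and not the implementation used in the work cited; the class `Var`, the module-level list `_tape` and the function `gradient` are hypothetical names, and only addition, multiplication and the sine function are supported. Each node stores the local partial derivatives with respect to its arguments during the forward sweep, and the reverse sweep accumulates the adjoints in reverse order of creation.

```python
import math

_tape = []  # nodes in order of creation (the forward sweep)


class Var:
    def __init__(self, val, parents=()):
        self.val = val          # value t_q computed in the forward sweep
        self.adj = 0.0          # adjoint dy/dt_q, filled in by the reverse sweep
        self.parents = parents  # pairs (argument node, local partial derivative)
        _tape.append(self)

    def __add__(self, other):
        return Var(self.val + other.val, ((self, 1.0), (other, 1.0)))

    def __mul__(self, other):
        return Var(self.val * other.val,
                   ((self, other.val), (other, self.val)))


def sin(t):
    return Var(math.sin(t.val), ((t, math.cos(t.val)),))


def gradient(y, inputs):
    """Reverse sweep: initialise the adjoints, then accumulate them from the
    last recorded operation back to the inputs."""
    for node in _tape:
        node.adj = 0.0
    y.adj = 1.0
    for node in reversed(_tape):
        for parent, partial in node.parents:
            parent.adj += partial * node.adj
    return [x.adj for x in inputs]


# Example: y = x1 * x2 + sin(x1) at (x1, x2) = (2, 3).
x1, x2 = Var(2.0), Var(3.0)
y = x1 * x2 + sin(x1)
grad = gradient(y, [x1, x2])
print(grad)
```

A single reverse sweep delivers all partial derivatives at once, at the price of storing one tape entry per operation, which is the memory requirement governed by $m$.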


5.1. Advantages 


Both FAD and RAD provide mathematically accurate derivatives. RAD is
particularly efficient for problems with a large number of independent variables.
The memory requirements of RAD depend on the value $m$,
i.e., the number of operations and elementary functions required to compute
the function value. Higher derivatives can also be calculated.
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5.2. Disadvantages 


Both FAD and RAD require some effort to implement. For languages such 
as Fortran 90 that support operator overloading, FAD can be implemented 
by overloading the standard operations and elementary functions with op- 
erations on variable types that store derivative information in addition to 
function values. For RAD, information required for the reverse sweep has 
to be stored as the forward sweep progresses. The FAD and RAD algo- 
rithms require the modification of the declarative part of the source code 
to replace real variables with variables that store the required additional 
information. The rest of the code remains substantially the same. 


6. Examples 
6.1. $f(x) = x^{-1}\cos x^{-1}$


We have applied finite differences (FD), the complex step method (CS)
and forward and reverse automatic differentiation (FAD, RAD) to provide
estimates $\hat{f}$ of the derivative $f'$ of the function

$$f(x) = x^{-1}\cos x^{-1}, \qquad f'(x) = x^{-3}\sin x^{-1} - x^{-2}\cos x^{-1}.$$

As $x$ becomes smaller, the function and its derivative oscillate with increasing
frequency and amplitude. We have calculated the derivatives at
100 points in the interval $[0.6, 6.0] \times 10^{-4}$ and computed the relative error
$e = (f' - \hat{f})/|f'|$ for the four methods. In this interval the function
attains values of the order of $10^{4}$ and its derivative that of $10^{12}$. We find
that for CS, FAD and RAD, the relative error is no more than $5.0 \times 10^{-16}$,
which compares favourably with the machine precision $\eta = 2.2 \times 10^{-16}$. By
contrast, the relative error for the FD method ranges between 107° and 
100.0. 
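A small experiment in the spirit of this example can be sketched as follows, comparing the complex step estimate with a central difference estimate at a single point of the interval. It is not a reproduction of the computations reported above: the evaluation point $x = 3 \times 10^{-4}$ and the finite difference step $h = 10^{-8}$ are arbitrary choices, and the function names are hypothetical.

```python
import cmath
import math


def f(z):
    """f(z) = cos(1/z) / z, written so that z may be real or complex."""
    return cmath.cos(1.0 / z) / z


def f_prime(x):
    """Analytical derivative x**-3 * sin(1/x) - x**-2 * cos(1/x)."""
    return math.sin(1.0 / x) / x**3 - math.cos(1.0 / x) / x**2


def complex_step(x, h=1e-100):
    return f(complex(x, h)).imag / h


def central_difference(x, h):
    return (f(x + h).real - f(x - h).real) / (2.0 * h)


x0 = 3.0e-4
exact = f_prime(x0)
cs_error = abs(complex_step(x0) - exact) / abs(exact)
fd_error = abs(central_difference(x0, 1e-8) - exact) / abs(exact)
print(cs_error, fd_error)
```

Because the function oscillates on a scale of about $x^2$, the finite difference step must be much smaller than $x^2$ to control truncation error, while the complex step can remain tiny without any cancellation.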


6.2. Uncertainty in interferometric length measurements 


The estimate of displacement d derived from the interferometric measure- 
ment of optical path length D is given by d = D/n(t,p,v) where n is the 
bulk refractive index of the air along the light path and depends on temper- 
ature t, pressure p and humidity as measured by the partial vapour pressure 
v. In practice, the partial vapour pressure is often derived as a function 
v(s) of the dewpoint temperature s. Given uncertainties associated with 
measurements D, t, p and s, the uncertainty associated with the geometric 
displacement d is calculated as in (4), section 2.2, and therefore involves 




calculating the partial derivatives of d = d(D,t,p,s). The revised Edlén 
formula? for n(t, p,v), for a fixed wavelength, is of the form 


$$n(t, p, v) = 1 + c_1 p\, \frac{1 + (c_2 - c_3 t)\, p}{1 + c_4 t} - c_5 v,$$


while that for v(s) is of the form’? 


$$v(s) = \exp\left\{ c_6/\tilde{s} + c_7 + c_8 \tilde{s} + c_9 \tilde{s}^2 + c_{10} \log \tilde{s} \right\}, \qquad \tilde{s} = s + c_{11},$$


where $\mathbf{c} = (c_1, \ldots, c_{11})^T$ are fixed constants. (If the wavelength is regarded
as a variable, then $c_1 = c_1(\lambda)$ and $c_5 = c_5(\lambda)$ are nonlinear functions
of wavelength.) With these formulae, the evaluation of $d = d(D, t, p, s)$ has
been encoded in software implementations in Matlab and Fortran 90, lan- 
guages that support complex arithmetic. The evaluation of the sensitivity 
coefficients is therefore very simple using the complex step method. 


7. Concluding remarks 


In this paper, we have been concerned with methods for calculating the 
derivatives of functions. In particular, we have been interested in finite 
differences (FD), the complex step method (CS) and forward and reverse 
automatic differentiation (FAD and RAD). We have analyzed these meth- 
ods in terms of accuracy, ease of implementation and efficiency with respect 
to speed and memory. We have found: 1) FD is the easiest to implement 
and use but is inaccurate, losing as much as half the available figures. FD 
is also inefficient for problems involving a large number of variables. 2) CS
is easy to implement and use, accurate for first derivatives but can be in- 
efficient for large problems. 3) FAD produces accurate first (and higher) 
derivatives, is straightforward to implement and use but can be inefficient 
for problems with a large number of independent variables. 4) RAD pro- 
duces accurate first (and higher) derivatives and its memory requirements 
and computing times are determined by the complexity of the function, 
rather than the number of independent variables. It is quite straightfor- 
ward to use but its implementation requires significant effort. More details 
can be found in Boudjemaa et al.”. 
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USAGE OF NON-CENTRAL PROBABILITY DISTRIBUTIONS 
FOR DATA ANALYSIS IN METROLOGY 


A. CHUNOVKINA 


D.I.Mendeleyev Institute for Metrology, Moskovsky pr.19, 
St. Petersburg, 190005, Russia 
E-mail: A.G.Chunovkina@vniim.ru 


Methods of statistical hypothesis testing are widely used for data analysis in
metrology. As a rule they are based on the normal distribution, $t$-distribution, $\chi^2$-distribution
and $F$-distribution. The criterion for hypothesis testing is usually characterized only by the
level of significance. The power of the criterion is not actually used in
metrological practice. The paper discusses the use of the corresponding non-central
distributions to evaluate the power of the criterion, as well as the actual level of
significance in the presence of systematic biases in measurement results. Examples of
testing the measurand value, the difference between measurement results, the consistency
of the data and the model of relations are considered.


1. Introduction 


Statistical methods for testing hypotheses are widely used in metrology in cases
where a decision must be made on the basis of the measurement results
obtained. A few examples are given below:

• Testing the deviation of the value of a parameter from its specified standard value (testing reference values of standard materials),
• Testing the significance of systematic differences between measurement results obtained by various measurement procedures,
• Testing the consistency of experimental data with a suggested model (model of functional relations),
• Testing the homogeneity of data received from various sources (testing the difference between the means and the variances),
• Others.

A conventional Neyman and Pearson approach for testing hypotheses is
applied [1]. If a measurement result, $x$, is in the region of values that are
consistent with the null hypothesis, the null hypothesis $H_0$ is accepted.
Otherwise, if the measurement result is in the critical region $\Omega_1$, $H_0$ is rejected.
Here $\Omega_1 = \Omega \setminus \Omega_0$, where $\Omega$ is the region of all possible measurement results and
$\Omega_0$ is the region of values consistent with the null hypothesis. As a rule,
percentiles of the normal, $\chi^2$, $t$ and $F$ distributions are used as critical
values in determining the region of values consistent with $H_0$. The
critical region is determined so that the probability of incorrect rejection of
the null hypothesis (type I error probability, or level of significance of the criterion)
is small enough (usually accepted values are 0.05 or 0.01):

$$P(x \in \Omega_1 \mid H_0) = \alpha. \qquad (1)$$


Here the Neyman and Pearson interpretation of the level of significance
(significance level) should not be confused with the significance level of observed
values (measurement results). The latter interpretation will also be used in this
paper in 2.5.

In using statistical methods for testing hypotheses in metrology, at least two 
aspects of their correct application should be considered: 

1. The acceptance of the null hypothesis in fact means acceptance of the
related model, which can be, for example, a specified value of a parameter or a model
of the relation between variables. The model is chosen as adequate to the data.
Then it can be applied to data processing or to measurement design.
However, strictly speaking, the question of the model accuracy remains
open, as the acceptance of a model does not yet mean its “trueness”.
Different models can be accepted on the basis of the same criterion and on
the basis of the same measurement results. Their subsequent application
leads to different results, and the deviations between them can exceed the
associated uncertainties assigned to these results. This is easily explained by
the fact that the uncertainty of the measurement results is evaluated within 
the frame of the model approved. Since it is difficult to characterize 
quantitatively the uncertainty of the model accepted, it is important to 
search for another way to characterize the validity of the model used. One 
of the possible ways is to estimate the probability of incorrect approval of 
the null hypothesis related to the model considered. In other words the 
criterion power should be estimated. 

2. The application of statistical methods of testing hypotheses, as well as of 
statistical estimations of parameters, in metrology requires the correct 
handling and treatment of the systematic errors of measurement results. The 
systematic biases in measurement results should be taken into account when 
level of significance and criterion power are actually estimated, as well as 
when the judgment on the measurement results is made. 

The paper considers these aspects applied to some concrete examples. 
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2. Example A. Testing the null hypothesis about equality of the 
measurand value and a specified standard value 


The following task is considered: testing the null hypothesis $H_0: a = a_0$, where
$a$ is the actual measurand value and $a_0$ is the specified standard value. It is
assumed that the measurement results $X_1, \ldots, X_n$ come from a normal distribution
$N(a, \sigma^2)$. Under $H_0$ the test statistic

$$T = \frac{\bar{X} - a_0}{S}\sqrt{n}, \qquad (2)$$

where $\bar{X} = \frac{1}{n}\sum_{i=1}^{n} X_i$ and $S^2 = \frac{1}{n-1}\sum_{i=1}^{n} (X_i - \bar{X})^2$, follows Student's $t$-distribution
with $n-1$ degrees of freedom. Hence, the region of values that are
consistent with $H_0$ is defined by the inequality:


il< toos (0-1), 2de t= Sas, (3) 


According to the above, the following questions should be answered: 

1. What is the probability of rejecting $H_0$ when it is not true? How should the
alternative hypothesis $H_1$ be formulated?

2. How many repeated measurements $n$ should be carried out to reject $H_0$
with a given probability (power) when the actual deviation of the
measurand value from the specified value is more than $\Delta a$?

3. What are the actual type I and type II error probabilities when the
conventional well-known criterion, with the significance level assigned in
advance, is used in the presence of systematic biases in measurement
results?

4. Should any correction be made according to the test results? Or should the
original experimental deviation from the specified value be used for
the uncertainty evaluation?


2.1. Estimation of the criterion power 


The answer to the first question above requires the formulation of the
alternative hypothesis $H_1$. To do this, we should use some additional
information, which is usually available in practice. For example, when testing
the deviation from the standard value (in quality control inspection or other), the
limit of the permitted deviations is usually specified. Denote the limit of the
negligible deviations as $\Delta a_0$. Then the alternative can be formulated as follows:

$$H_1: |a - a_0| \ge \Delta a_0. \qquad (4)$$


The maximum Type II error probability (the risk of incorrect concluding that Ho 
is true) is realized for actual measurand values equal to a =a) + Aa: 


p= max P(H.|H,)= Phe < taos (2-1 Ja = ay + Ady (5) 
Consequently the criterion power equals 1- 8 


—_—— is 
S 
distributed as non-central t-distribution with parameter of non-centrality 


Aa, Vn 


Under H, (a =a, +Aa, or a=a,—Aa, ) the test statistics T = 


O 


To use the standard tables for criterion power values it is necessary to 
specify: n — degree of freedom, æ — level of significance, c — parameter of non- 
centrality. 

The increase of the above parameters results in the increase of the criterion 
power. First of all, it concerns the dependence upon the parameter of non- 
centrality. Coming back to the example considered, it should be stressed that, in 
the case given, the parameter of non-centrality itself is an increasing function of 
the number of repeated observations. 
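
The growth of the power with the non-centrality parameter can be illustrated numerically. The Python sketch below is not the tabulated non-central t calculation referred to above: as a simplifying assumption it treats $\sigma$ as known and replaces the non-central t-distribution by a normal approximation, which is reasonable only for moderately large $n$. The function name `approx_power` and its arguments are illustrative.

```python
from math import sqrt
from statistics import NormalDist


def approx_power(delta, sigma, n, alpha=0.05):
    """Approximate power of the two-sided test of H0: a = a0.

    Assumption: sigma is known, so the non-central t-distribution is
    replaced by a normal distribution shifted by the non-centrality
    parameter c = delta * sqrt(n) / sigma.
    """
    std = NormalDist()
    z = std.inv_cdf(1 - alpha / 2)      # two-sided critical value
    c = delta * sqrt(n) / sigma         # non-centrality parameter
    # Probability that the statistic falls outside (-z, z) when shifted by c.
    return std.cdf(c - z) + std.cdf(-c - z)


if __name__ == "__main__":
    for n in (5, 10, 20, 40):
        print(n, round(approx_power(delta=1.0, sigma=1.0, n=n), 3))
```

With a zero deviation the function returns the nominal level $\alpha$, and for a fixed deviation the returned power grows with $n$ through $c$.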


2.2. The choice of the number of repeated observations


The standard tables for the criterion power values can also be used for a
rational choice of the number of repeated observations $n$ when the level of
significance and the power of the criterion are specified. The alternative
hypothesis $H_1$ is determined as above: $H_1: |a - a_0| \ge \Delta a_0$.
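
As a rough illustration of this choice, the sketch below searches for the smallest $n$ that reaches a requested power. It is a hedged example only: it uses the same known-$\sigma$ normal approximation instead of the non-central t tables, so for small $n$ the result may be slightly optimistic. The names `approx_power` and `required_observations` are hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def approx_power(delta, sigma, n, alpha=0.05):
    """Normal approximation (sigma assumed known) to the test power."""
    std = NormalDist()
    z = std.inv_cdf(1 - alpha / 2)
    c = delta * sqrt(n) / sigma         # non-centrality parameter
    return std.cdf(c - z) + std.cdf(-c - z)


def required_observations(delta, sigma, target_power, alpha=0.05, n_max=10_000):
    """Smallest n whose approximate power reaches target_power.

    Returns None if no n up to n_max is sufficient.
    """
    for n in range(2, n_max + 1):
        if approx_power(delta, sigma, n, alpha) >= target_power:
            return n
    return None


if __name__ == "__main__":
    # Permitted deviation equal to one standard deviation, 90 % power.
    print(required_observations(delta=1.0, sigma=1.0, target_power=0.9))
```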


The procedure for choosing $n$ is valid when the random dispersion of the
measurement results dominates their systematic biases. As the number of
repeated observations increases, the systematic and random effects become
comparable. A further increase in the number of observations then does not
lead to the increase in criterion power that might otherwise be expected.

2.3. Testing the systematic differences between measurement results


It is important to outline special cases where the application of the
criterion is correct:

1. Testing the changes of the parameter under control. Here, the same
measurement procedure is used, so systematic biases do not influence the
differences between measurement results and should not be taken into
account. For example, this task occurs when the stability of some process is
under control. In this case, $H_0$ relates to the assumption that the
parameter value remains the same and, consequently, the differences
between measurement results are caused only by the reproducibility of the
measurement procedure used.

2. Testing the systematic difference between measurement results obtained by
different measurement procedures. Here it is known that the measurand
value is the same for the two measurements. In this case, $H_0$
relates to the assumption that there is no systematic bias between the two
measurement results and, consequently, the distinction between them is caused
only by their random dispersion.

The null hypothesis is formulated as follows: $H_0: a_1 = a_2$, where
$a_1, a_2$ are the actual measurand values for the two measurements. (In a
more general case, testing the null hypothesis $H_0: a_1 = a_2 = \ldots = a_N$
is considered and the F-test statistic is used.) Under $H_0$ the difference
between two measurement results should satisfy the following inequality:

$$|\bar{x}_1 - \bar{x}_2| \le t_\alpha(n_1 + n_2 - 2)\,
\sqrt{\frac{n_1 + n_2}{n_1 n_2 (n_1 + n_2 - 2)}}\,
\sqrt{(n_1 - 1)S_1^2 + (n_2 - 1)S_2^2}, \tag{6}$$

$$\bar{x}_i = \frac{1}{n_i}\sum_{j=1}^{n_i} x_{ij}, \qquad
S_i^2 = \frac{1}{n_i - 1}\sum_{j=1}^{n_i} (x_{ij} - \bar{x}_i)^2, \qquad i = 1, 2,$$

where $t_\alpha(n_1 + n_2 - 2)$ is the percentile of the t-distribution with
$n_1 + n_2 - 2$ degrees of freedom and level of significance $\alpha$.


For calculation of the criterion power, a non-central t-distribution with
$n_1 + n_2 - 2$ degrees of freedom is used, with non-centrality parameter
$c = \frac{\Delta a_0}{\sigma}\sqrt{\frac{n_1 n_2}{n_1 + n_2}}$, where
$\Delta a_0$ is the limit of the permitted differences as determined above.

2.4. Handling systematic biases in measurement results


When the conventional well-known criteria are used in metrological practice in
the presence of systematic biases in measurement results (which are not taken
into account), the actual type I error probabilities differ from their expected
values. Generally, systematic biases in measurement results lead to an
increase in the level of significance and to a decrease in the criterion power,
compared with their values in the absence of systematic biases. It is obvious
that the null hypothesis and the alternative cannot be discriminated when
$\Delta a_0$ is less than the level of the systematic bias in the measurement
results, or of the difference of systematic biases, depending on the task
considered. It is suggested to take the systematic biases into account when
determining the alternative hypothesis by putting $\Delta a_0 = \theta$, where
$\theta$ is the limit of the systematic biases:

$$H_0: \Delta a_0 = 0, \qquad H_1: \Delta a_0 = \theta. \tag{7}$$

Under the $H_0$ hypothesis, the difference between the expectation of the
measurement results and the specified standard value can lie within the limits
$(-\theta, \theta)$. So the level of significance (type I error probability) is
suggested to be characterized by the average value within these limits:

$$\bar{\alpha} = \frac{1}{2\theta}\int_{-\theta}^{\theta}
P\left(|t| > t_\alpha(n-1) \mid a = a_0 + x\right) dx. \tag{8}$$

Here, a uniform distribution of systematic biases is assumed, as in the
Bayesian approach to testing hypotheses. To calculate
$P(|t| > t_\alpha(n-1) \mid a = a_0 + x)$, the standard tables for the
t-criterion power values (based on the non-central t-distribution with
$c = \frac{x}{\sigma}\sqrt{n}$) are used. Under $H_1$, the test statistic
considered above has a non-central t-distribution with non-centrality
parameter $c = \frac{\theta}{\sigma}\sqrt{n}$:

$$\beta = P\left(|t| < t_\alpha(n-1) \mid a = a_0 + \theta\right).$$
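
The averaging over the bias limits can be sketched numerically. The example below is an assumption-laden illustration rather than the tabulated procedure: the rejection probability at each bias value is taken from a known-$\sigma$ normal approximation instead of the non-central t-distribution, and the average over $(-\theta, \theta)$ is computed with a simple midpoint rule. The function names are hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def rejection_probability(shift, sigma, n, alpha=0.05):
    """P(reject H0) when the expectation is offset by `shift` (normal approx.)."""
    std = NormalDist()
    z = std.inv_cdf(1 - alpha / 2)
    c = shift * sqrt(n) / sigma         # non-centrality parameter
    return std.cdf(c - z) + std.cdf(-c - z)


def average_type1_error(theta, sigma, n, alpha=0.05, steps=1000):
    """Average rejection probability for a bias uniform on (-theta, theta)."""
    if theta == 0:
        return rejection_probability(0.0, sigma, n, alpha)
    h = 2 * theta / steps
    total = sum(
        rejection_probability(-theta + (k + 0.5) * h, sigma, n, alpha)
        for k in range(steps)
    )
    return total * h / (2 * theta)      # midpoint rule divided by interval width


if __name__ == "__main__":
    for n in (5, 10, 40):
        print(n, round(average_type1_error(theta=0.5, sigma=1.0, n=n), 3))
```

Under these assumptions the averaged level exceeds the nominal $\alpha$ and grows with $n$, in line with the qualitative statement above.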


If the procedure for the determination of the alternative hypothesis does not
satisfy the user requirements, a more precise measurement procedure should be
used for testing the hypotheses.

2.5. Should the original values be retained, or should the measurement
uncertainty be increased in accordance with the test results?


Let us consider the following situation. A specified standard value $a_0$ is
used for the subsequent calculations and, consequently, it influences the
accuracy of the final result. A natural question arises: is it necessary to
take into account the experimental deviations of the measurement result from
the specified value when estimating the quality of the final results or not?
If yes, how should it be done? As a matter of fact, it is a question about the
"validity" ("trueness") of the model, which is accepted by testing the related
hypotheses.


Within the frame of the approach considered (the Neyman and Pearson approach
for testing hypotheses), it is difficult to answer the above question.
According to this approach the null hypothesis is accepted or rejected
depending on whether the measurement result is in the critical region or not.
Therefore, the actual magnitude of the deviation from the standard value is not
taken into account. To account for the deviation magnitude we need another
interpretation of the significance level. Here it is interpreted as the
significance level of the data received (as opposed to the above
interpretation as the significance level of the criterion) [2]. In this sense
the significance level characterizes the consistency (agreement) of the
particular data obtained with the hypothesis tested. Note that in this
interpretation type I and type II error probabilities are not used, because no
alternatives are considered. So the following procedure is suggested. Under
$H_0$, the probability of the event that the test statistic
$T = (\bar{X} - a_0)\sqrt{n}/S$ exceeds the actually observed value should be
calculated:

$$\pi = P\left(T > \frac{(\bar{x} - a_0)\sqrt{n}}{s} \,\middle|\, H_0\right).$$
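
A minimal sketch of how such a data significance level might be computed is given below. It assumes the one-sided reading of the definition above (the probability that $T$ exceeds the observed value) and evaluates the Student t tail by numerical integration of the density with Simpson's rule, since only the Python standard library is used. The function names are illustrative.

```python
from math import exp, lgamma, log, pi, sqrt
from statistics import mean, stdev


def t_pdf(t, nu):
    """Density of Student's t-distribution with nu degrees of freedom."""
    log_norm = lgamma((nu + 1) / 2) - lgamma(nu / 2) - 0.5 * log(nu * pi)
    return exp(log_norm - (nu + 1) / 2 * log(1 + t * t / nu))


def t_upper_tail(t_obs, nu, steps=2000):
    """P(T > t_obs) by composite Simpson integration on [0, |t_obs|]."""
    b = abs(t_obs)
    h = b / steps                       # steps must be even for Simpson's rule
    total = t_pdf(0.0, nu) + t_pdf(b, nu)
    for k in range(1, steps):
        total += (4 if k % 2 else 2) * t_pdf(k * h, nu)
    central = total * h / 3             # P(0 < T < |t_obs|)
    return 0.5 - central if t_obs >= 0 else 0.5 + central


def data_significance(observations, a0):
    """Significance level of the data under H0: a = a0 (one-sided tail)."""
    n = len(observations)
    t_obs = (mean(observations) - a0) * sqrt(n) / stdev(observations)
    return t_upper_tail(t_obs, n - 1)


if __name__ == "__main__":
    print(data_significance([10.3, 10.1, 10.4, 10.2, 10.5], a0=10.0))
```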


Depending on the $\pi$ value, the following conventional procedure is
applied:

$\pi$ value     Interpretation
…               Strong argument for $H_0$ ⇒ no correction
…               Some doubts in $H_0$ ⇒ further analysis
$\pi$ = 0,02    Strong argument against $H_0$ ⇒ correction

3. Example B. Testing the hypothesis about the model between
variables


For testing the hypothesis about a model, the following test statistic is
usually used (for simplicity, a linear relation is considered):

$$v^2 = \frac{N(n-1)\, n \sum_{i=1}^{N} (\bar{y}_i - \hat{y}_i)^2}
{(N-2) \sum_{i=1}^{N} \sum_{j=1}^{n} (y_{ij} - \bar{y}_i)^2}, \tag{9}$$

where $(x_i, y_{ij})$, $i = 1, \ldots, N$, $j = 1, \ldots, n$ are the
experimental data, $\bar{y}_i = \frac{1}{n}\sum_{j=1}^{n} y_{ij}$, and
$\hat{y}_i$ is the value of the fitted linear relation at $x_i$.

It is assumed that the data pertain to normal distributions with expectations
equal to the measurand values and variances $\sigma_x^2$, $\sigma_y^2$. The
case of a controlled variable is considered, so the least squares estimation
of linear relations is valid [3], [4], [5]. If

$$v^2 \le F_{0.95}(\nu_1, \nu_2), \tag{10}$$

then the null hypothesis of a linear relation is accepted. Here
$F_{0.95}(\nu_1, \nu_2)$ is the 95% percentile of the F-distribution with
$\nu_1 = N - 2$, $\nu_2 = N(n-1)$ degrees of freedom.
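
The statistic of Eq. (9) can be computed directly from replicated data. The following sketch is one possible reading, assuming an ordinary least squares line fitted to the group means (equal numbers of repeats at every $x_i$) and the usual lack-of-fit form of the ratio; the function name and data layout are illustrative and the comparison with the F percentile is left to tables.

```python
def lack_of_fit_statistic(xs, ys):
    """Lack-of-fit ratio for a straight line.

    xs: N settings of the controlled variable.
    ys: N lists, each holding n repeated observations at the same x.
    """
    big_n, n = len(xs), len(ys[0])
    y_bar = [sum(row) / n for row in ys]            # group means
    x_mean = sum(xs) / big_n
    y_mean = sum(y_bar) / big_n
    sxx = sum((x - x_mean) ** 2 for x in xs)
    slope = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, y_bar)) / sxx
    intercept = y_mean - slope * x_mean
    # Scatter of group means about the fitted line (N - 2 degrees of freedom).
    lack = n * sum((yb - (intercept + slope * x)) ** 2 for x, yb in zip(xs, y_bar))
    # Scatter of repeats about their own group mean (N(n - 1) degrees of freedom).
    pure = sum((y - yb) ** 2 for row, yb in zip(ys, y_bar) for y in row)
    return (big_n * (n - 1) * lack) / ((big_n - 2) * pure)


if __name__ == "__main__":
    xs = [0.0, 1.0, 2.0]
    ys = [[-1.0, 1.0], [1.0, 3.0], [0.0, 2.0]]
    print(lack_of_fit_statistic(xs, ys))
```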


The application of the F-criterion is valid when the independent variable $x$
is known without errors. In practice, there are always errors in the
independent variable as well. Hence, the application of the F-criterion is,
strictly speaking, incorrect. The test statistic of Eq. (9) is distributed as a
non-central F-distribution with parameter
$\lambda = b^2 (N-1)\,\sigma_x^2 / (\sigma_y^2 / n)$, where $b$ is the slope of
the linear relation, and degrees of freedom $(N-2,\ N(n-1))$. The actual type I
error probability for the criterion considered can be estimated using standard
tables for F-criterion power values. To use these tables it is necessary to
determine the parameter $\varphi = \sqrt{\lambda / (\nu_1 + 1)}$ and the
degrees of freedom. For the case considered,
$\varphi = b\sqrt{n}\,\sigma_x / \sigma_y$. Note that an increase in the number
of repeated observations results in an increase in the probability of rejecting
the null hypothesis when it is true (type I error probability). For example,
for $\varphi = 1$, $N = 4$ the actual type I error probability is in the range
from 0,2 to 0,3 (depending on $n$), while the assigned value is 0,05.


4. Conclusions 


The general problem of the application of statistical methods for testing
hypotheses in metrology has been considered using concrete examples. The
following questions have been outlined:
1. Interpretation of judgments based on the results of testing hypotheses.
Characterization of the quality of the model related to the hypothesis tested.
2. Accounting for the systematic biases in measurement results when
formulating an alternative hypothesis.
3. Using non-central distributions for:
• estimation of the criterion power
• estimation of the actual level of significance when conventional test
statistics are used in the presence of systematic biases.

We present a new software product (DFM Calibration of Weights) developed
at Danish Fundamental Metrology (DFM), which has been implemented in 
mass measurements. This product is used for designing the mass 
measurements, acquiring measurement data, data analysis, and automatic 
generation of calibration certificates. Here we will focus on the data analysis, 
which employs a general method of least squares to calculate mass estimates 
and the corresponding uncertainties. This method provides a major 
simplification in the uncertainty calculations in mass measurements, and it 
allows a complete analysis of all measurement data in a single step. In addition, 
we present some of the techniques used for validation of the new method of 
analysis. 


1. Introduction 


The primary mass standard in Denmark is the national kilogram prototype no. 
48. It is made of the same platinum-iridium alloy (90% Pt — 10% Ir) as the 
international kilogram prototype. Traceability to the international prototype, 
which is placed at the Bureau International des Poids et Mesures (BIPM) in 
Paris, is obtained by calibration of the Danish prototype against the prototype 
copies maintained at BIPM. The prototype no. 48 was cast in 1937-38 in France, 
and it was delivered by BIPM to Denmark in 1950. The Danish prototype is 
kept at DFM in a controlled environment, and it is used in calibration of DFM’s 
stainless steel weights, which are in turn used as reference weights in client 
calibrations. 

The process of mass calibration is quite complex, and the data acquisition 
and the analysis of the weighing results are most easily done by the use of 
computers. In the following we present the new software implemented for mass 
calibrations at DFM together with an introduction to the method for data 
analysis and some of the validation results. 


* Work partially funded under EU SofTools_MetroNet Contract N° G6RT-CT-2001-05061.

2. The principle of subdivision and multiplication


We will describe the subdivision procedure by considering an example where a 
set of weights (1kg, 500g, 200g, 200g*, 100g, 100g*) with unknown masses is 
calibrated. First we extend the weight set by adding a reference weight (R1kg)
with known mass. A measurement sequence is then established, where each line 
in the sequence provides an apparent mass difference of two weight 
combinations of the same nominal mass. A mass comparator is used for these 
mass measurements. The mass comparator can measure the mass difference 
between two weights of the same nominal mass with a very high resolution. 
Different mass comparators are often used in different ranges of mass values. A 
possible sequence of measurements is illustrated in the following equation, 
where the measured mass differences are written in matrix form: 


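$$
\begin{pmatrix}
\Delta I_1 \\ \Delta I_2 \\ \Delta I_3 \\ \Delta I_4 \\ \Delta I_5 \\
\Delta I_6 \\ \Delta I_7 \\ \Delta I_8 \\ \Delta I_9
\end{pmatrix}
=
\begin{pmatrix}
1 & -1 & 0 & 0 & 0 & 0 & 0 \\
1 & 0 & -1 & -1 & -1 & 0 & -1 \\
1 & 0 & -1 & -1 & -1 & -1 & 0 \\
0 & 0 & 1 & -1 & -1 & 0 & -1 \\
0 & 0 & 1 & -1 & -1 & -1 & 0 \\
0 & 0 & 0 & 1 & -1 & 0 & 0 \\
0 & 0 & 0 & 1 & 0 & -1 & -1 \\
0 & 0 & 0 & 0 & 1 & -1 & -1 \\
0 & 0 & 0 & 0 & 0 & 1 & -1
\end{pmatrix}
\begin{pmatrix}
m_{R1kg} \\ m_{1kg} \\ m_{500g} \\ m_{200g} \\ m_{200g*} \\ m_{100g} \\ m_{100g*}
\end{pmatrix},
\quad \text{or} \quad \Delta I = X m \tag{1}
$$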


The design matrix $X$ on the right specifies the weights that enter in each
comparison (each line in $X$). The vector $m$ consists of the apparent masses
of the weights (with only $m_{R1kg}$ known in advance). The vector $\Delta I$
holds the apparent mass differences calculated from the readings of the mass
comparator. Each comparison is typically repeated 12 times or more in order to
reduce the statistical uncertainty and to ensure repeatability in each
measurement.

Eq. (1) is an over-determined set of linear equations, which does not have a
solution because of the imperfect measurement of $\Delta I$. However, by a
linear least squares fit in which the known value of the reference mass is used
as a constraint [1,2], it is possible to derive the ‘best’ $m$-vector, which
will minimize the difference between the left- and right-hand side of Eq. (1)
in the $l_2$ norm.
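
The constrained fit can be illustrated with a small unweighted example. The sketch below is not the DFM implementation: it assumes the design matrix as read from Eq. (1), imposes the constraint simply by moving the known reference mass to the left-hand side, and solves the normal equations by Gaussian elimination. The names `DESIGN`, `solve_linear` and `fit_masses` are hypothetical.

```python
# Design matrix as read from Eq. (1); columns: R1kg, 1kg, 500g, 200g, 200g*, 100g, 100g*.
DESIGN = [
    [1, -1, 0, 0, 0, 0, 0],
    [1, 0, -1, -1, -1, 0, -1],
    [1, 0, -1, -1, -1, -1, 0],
    [0, 0, 1, -1, -1, 0, -1],
    [0, 0, 1, -1, -1, -1, 0],
    [0, 0, 0, 1, -1, 0, 0],
    [0, 0, 0, 1, 0, -1, -1],
    [0, 0, 0, 0, 1, -1, -1],
    [0, 0, 0, 0, 0, 1, -1],
]


def solve_linear(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    aug = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = sum(aug[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (aug[r][n] - s) / aug[r][r]
    return x


def fit_masses(design, delta_i, m_ref):
    """Least squares masses with column 0 (the reference) fixed to m_ref."""
    reduced = [row[1:] for row in design]
    rhs = [d - row[0] * m_ref for row, d in zip(design, delta_i)]
    k = len(reduced[0])
    normal = [[sum(r[p] * r[q] for r in reduced) for q in range(k)] for p in range(k)]
    moment = [sum(r[p] * y for r, y in zip(reduced, rhs)) for p in range(k)]
    return solve_linear(normal, moment)


if __name__ == "__main__":
    true_m = [1.000001, 0.9999995, 0.5000002, 0.2000001, 0.1999998, 0.1000003, 0.0999999]
    readings = [sum(x * m for x, m in zip(row, true_m)) for row in DESIGN]
    print(fit_masses(DESIGN, readings, m_ref=true_m[0]))
```

A real analysis would also weight the comparisons and propagate the uncertainty of the reference mass; both are omitted here for brevity.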

It is of course possible to extend the design matrix to include weights of 
even smaller masses or weights with masses above 1 kg. However, it is often 
convenient to use separate design matrices for each decade of nominal mass 
values. This is normally accomplished by introducing combinations of weights 
from another decade as a single ‘sum-weight’. As an example, weights of 
nominal masses 50g, 20g, 20g* and 10g can be grouped as an S100g weight and
entered into the 1 kg to 100 g decade. Analysis of the 1 kg to 100 g decade gives
the mass value for the S100g sum, which will subsequently be used as the
reference value in the 100 g to 10 g decade.


3. The real measurement model 


In the real world, however, the volumes and shapes of the weight combinations
are not identical, and the measurements need correction for air buoyancy as well
as for the gravity gradient. These effects are included in the following weighing
equation for the $i$'th line in the measurement sequence:

$$f_i\,\Delta I_i = \sum_j x_{ij}\left[m_j\left(1 + \frac{g'}{g} z_{ij}\right) - a_i V_j\right] \tag{2}$$

Here $x_{ij}$ are the elements of the design matrix $X$, $g$ is the gravity
acceleration at a certain reference height, and $g'$ is the vertical gradient
of $g$. $z_{ij}$ is the vertical distance between the reference height and the
center of mass position (CM position) of weight $j$. $z_{ij}$ depends on $i$,
because weight $j$ may be stacked on top of other weights in design line $i$.
$V_j$ is the volume of weight $j$, and $f_i$ is a scale factor correction for
the comparator readings. The density of air ($a_i$) is calculated from
measurements of the temperature ($t_i$), pressure ($p_i$), dew-point
temperature ($td_i$) and CO$_2$ content ($x_{CO_2,i}$):

$$a_i = h(t_i, p_i, td_i, x_{CO_2,i}) \tag{3}$$


The formula $h$ is the 1981/91 equation for the determination of the density of
moist air recommended by the Comité International des Poids et Mesures
(CIPM) [3,4]. This equation is not perfect, and an uncertainty is associated with
this formula. Furthermore, each of the measurement instruments has a
calibration curve described by a set of parameters with associated uncertainties
and correlation coefficients. Thus, a very large number of measurement data and
calibration parameters enter into the calculation of the unknown masses, and a
complete uncertainty calculation according to the ISO Guide to the Expression
of Uncertainty in Measurement (GUM) becomes quite complex. The traditional
method, which is used at a number of National Metrology Institutes, is based on
the least squares solution presented in section 2 together with a number of
approximations and assumptions in order to apply the GUM method of
uncertainty calculation. Furthermore, each decade of mass values and the

Table 1. A list of the different parameters ($\zeta$ and $\beta$) and constraints ($f$), which enter in
a typical calibration of a weight set with 43 weights. The conventional mass of a
weight is defined by OIML [7]. CM: center of mass.












air density, instr. calib. 
values mass values 
(or volumes sum weights 
(or densities conv. masses 
43 Equations for 
coefficients 


CM info 


for reference weights density (or volume) 

18 Calibration parameters, ae 
e.g. 7, 

7 Temperatures for Lo 


thermal expansion 


1 Correction factor in 
air density equation 


337 parameters 182 parameters 237 constraints 








corresponding part of the design matrix are analyzed separately. The results for 
e.g. sum weights in one decade are then used as the “reference mass” in the next 
decade. This method was previously used at DFM. 
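
The weighing equation of this section is easy to evaluate in the forward direction, i.e. from assumed masses to an expected comparator reading. The sketch below assumes the form $f_i \Delta I_i = \sum_j x_{ij}[m_j(1 + (g'/g) z_{ij}) - a_i V_j]$ for Eq. (2); the function name, argument names and the example values are illustrative only and do not come from the DFM software.

```python
def weighing_result(x_row, masses, volumes, heights, air_density,
                    gradient_ratio=0.0, scale_factor=1.0):
    """Expected comparator reading for one line of the weighing design.

    x_row          design entries x_ij (+1, -1 or 0) for this line
    masses         m_j in kg
    volumes        V_j in m^3
    heights        z_ij in m (centre of mass above the reference height)
    air_density    a_i in kg/m^3
    gradient_ratio g'/g in 1/m (assumed relative vertical gravity gradient)
    scale_factor   f_i, scale correction of the comparator reading
    """
    total = 0.0
    for x, m, v, z in zip(x_row, masses, volumes, heights):
        total += x * (m * (1.0 + gradient_ratio * z) - air_density * v)
    return total / scale_factor


if __name__ == "__main__":
    # Two 1 kg weights of equal mass but slightly different volume.
    print(weighing_result([1, -1], [1.0, 1.0], [1.25e-4, 1.27e-4],
                          [0.0, 0.0], air_density=1.2))
```

In this example the two masses are identical, so the whole reading (about 2.4 mg) comes from the buoyancy difference of the 2 cm³ volume mismatch.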


4. The general least squares solution 


In DFM Calibration of Weights we have used a general least squares method for
the evaluation of mass estimates and uncertainties. We describe all input
parameters, for which prior estimates and associated uncertainties are available,
as elements of a vector $z$. The elements in $z$ include the comparator readings
in $\Delta I$, information about weight volumes, heights and thermal expansion
coefficients, and calibration parameters of the measurement instruments. The
covariance matrix $\Sigma$ describes the known uncertainties and possible
correlations for the input parameters $z$. A vector $\beta$ holds the parameters
for which no prior knowledge is available, i.e. the quantities that we wish to
determine (primarily masses and conventional masses [7]). An improved estimate
for $z$ (named $\zeta$) as well as an estimate for $\beta$ is obtained by
minimizing the weighted chi-square function with respect to $\zeta$:

$$\chi^2(\zeta; z) = (z - \zeta)^T \Sigma^{-1} (z - \zeta) \tag{4}$$

$\chi^2$ is minimized under the constraints induced by the weighing equations,
the air density formula, instrument calibrations etc., all represented as a
vector function $f$ fulfilling:

$$f(\beta, \zeta) = 0 \tag{5}$$

A simple method to solve this problem is presented in [5,6], where (improved)
estimates $\zeta$ and $\beta$ as well as associated uncertainties and
correlations are obtained as the output. If $f$ is a nonlinear function the
solution is obtained by iterative calculations.
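
The simplest special case may help to fix ideas. Assume that every element of $z$ is a direct, uncorrelated measurement of one unknown $\beta$, so that the constraints are $\zeta_i - \beta = 0$ and $\Sigma$ is diagonal. Minimizing Eq. (4) then reduces to the familiar weighted mean, as in the hedged Python sketch below. It is only an illustration; the method of [5,6] treats general nonlinear constraints and correlated inputs, and the function name is hypothetical.

```python
from math import sqrt


def weighted_mean_fit(z, u):
    """Constrained least squares for repeated direct measurements.

    z: measured values, u: their standard uncertainties (uncorrelated).
    Returns (beta, u_beta, chi2): the estimate, its standard
    uncertainty and the minimized chi-square value.
    """
    weights = [1.0 / (ui * ui) for ui in u]
    beta = sum(w * zi for w, zi in zip(weights, z)) / sum(weights)
    u_beta = 1.0 / sqrt(sum(weights))
    # With zeta_i = beta for all i, Eq. (4) becomes a weighted sum of squares.
    chi2 = sum(w * (zi - beta) ** 2 for w, zi in zip(weights, z))
    return beta, u_beta, chi2


if __name__ == "__main__":
    print(weighted_mean_fit([10.0, 10.2], [0.1, 0.1]))
```

A minimized $\chi^2$ much larger than the number of degrees of freedom (here, the number of measurements minus one) would indicate inconsistent inputs.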


5. DFM Calibration of Weights 


The DFM Calibration of Weights software consists of two modules. One 
module is based on a Microsoft Excel template with Visual Basic for 
Applications code. This module is used to specify the weights that enter into the 
calibration and to construct the weighing design X. This module is also used to 
automatically generate the calibration certificate as a Microsoft Word document 
based on the analyzed calibration data. Finally, the Excel module automatically 
updates the calibration database for DFM’s own weights and for the client 
weights. The other module is written in LabWindows (C code) and it controls 
the weighing sequence and data acquisition from mass comparators and 
instruments for monitoring the climate. The LabWindows module also includes 
the algorithm for data analysis using the approach described in section 4. 

As an example, we consider a typical calibration involving 43 mass
standards, which are calibrated using 3 different mass comparators. The
quantities collected in $\zeta$ and $\beta$ are described in Table 1, and we
note that a total of 519 parameters are involved in the calculation. The 237
equations defining the constraints are also listed in Table 1. Note that not
all parameters and equations from Table 1 are included in the simplified
description in sections 3 and 4. The analysis converges in 4 iterations, which
takes about 2 minutes on a 500 MHz Pentium PC. The uncertainties from the
calculation itself (e.g.
machine precision limitations on numerical derivatives, matrix operations etc.)
on the derived mass values are all below 10” relative to the calculated mass 
uncertainties. 

The advantage of the new method is that all decades in the calibration are 
analyzed in one step, and the consequences of changing one or more input 
parameters or uncertainties can be analyzed quickly. Furthermore, correlations 
between input parameters (typically the mass and volume of the reference 
weights) are easily included and no approximations of the involved equations 
are required. 


6. Validation of DFM Calibration of Weights 


DFM is accredited according to ISO 17025, and we must therefore fulfill the 
requirements on software validation given in this standard. DFM Calibration of 
Weights is validated using the Nordtest method and template for software 
validation [8]. DFM participated in the development of this validation method, 
and it is designed to comply with both the ISO 17025 and ISO 9001 
requirements. 

The Nordtest method is based on the software life cycle model, and it can be 
used both as a development tool and a validation tool. In Fig. 1 we show the 
different phases in the life cycle. The most important part of the validation is the 
testing part where the software is tested against the specified requirements. In
the following we focus on the tests that ensure that DFM Calibration of Weights 
performs as required. 

Careful inspection of the source code and dynamic testing were applied
during the development phase. Such structural testing provides confidence that 
each module or function works correctly, but more general tests are required to 
see how the individual modules interact and to test the integration of all 
software modules with the hardware and peripheral units (i.e. mass comparators 
and environmental instruments). 

To achieve correct calibration results, we must ensure that the data are 
transferred correctly from the different peripheral instruments to the computer 
and into the electronic files where raw data are stored. This is ensured by direct 
observation of values displayed on the instruments and by comparison to the 
values stored in the files. Furthermore, correct analysis of raw data must be 
checked, and finally it is checked that the analyzed data are transferred correctly 
into the final calibration certificate. The validation of the data analysis is 
difficult, since the employed method for the analysis is quite complex. Typically,
validations of calculations involve parallel testing, where manual calculations or
alternative (but equivalent) software solutions are used to reproduce the results.

Fig. 1. Schematic overview of the software life cycle model used in the Nordtest
validation method.


Instead of direct parallel testing, we have used the following tests to ensure 
correct data analysis: 


• Comparison of mass values obtained with the new software to previous
  mass values for DFM weights
• Analysis of old raw data using the new software
• Analysis of a constructed test data set


In addition, we continuously monitor the calibration results by including a few 
of DFM’s own weights as check standards whenever client weights are 
calibrated. By comparing the derived masses for the check standards with their 
calibration history, we can detect with a high probability if an error has 
occurred. In addition, the minimized $\chi^2$ value provides information
on the consistency of the measurement results [5,6].

Fig. 2. Calibration history of two DFM weights. The vertical scale is the difference
between the conventional mass and the nominal mass. The new software was
implemented in September 2002.


In Fig. 2 we show the calibration history of two weights from DFM’s 
weight set. The error bars on the measurements are the standard uncertainties 
(one standard deviation) calculated in the data analysis. This uncertainty 
includes the contribution from the uncertainty of the 1 kg reference weight. 
However, this contribution is only about 10° in relative units, and for the 
weights in Fig. 2 this contribution can be neglected. Thus, the error bars in Fig. 
2 reflect the uncertainty due to the weighing process. The scatter in the data for 
different measurements represents the repeatability of the measurements. The 
level of the repeatability is probably caused by unstable masses due to 
adsorption of dirt, water etc. on the weight surfaces. The lines represent the 
average mass value and the 1σ confidence intervals from a least squares fit to a
model describing drift and random variation. It is clear that no significant 
changes in mass values and calibration uncertainties are detected after the new 
software system is implemented. 

We have also re-analyzed the raw data from an old calibration task made
with the software previously in use, which was based on the analysis described
in sections 2 and 3. These raw data consist of weighing results ($\Delta I$),
values for environmental parameters and information on weights such as volumes,
heights etc. The format of the raw data is adjusted to the format used for the
new software, and the data are reanalyzed using the new software with the method
described in section 4. The results from the old analysis (i.e. original analysis
with old software) and the new analysis are shown in Fig. 3. It is seen that the
two software products using two different methods for the analysis agree within
the calculated uncertainty. The increased scatter at larger weight numbers can be
explained by insufficient resolution in the recorded mass values at small mass
values. Besides looking at the mass values, we have also investigated the
uncertainties obtained with the old and new software.

Fig. 3. Comparison of the calculated masses using the old and the new software/method
for analysis. Both analyses use the same set of raw data. The mass differences are
normalized to the calculated uncertainty. Horizontal axis: weight number (1 kg → 1 mg).

The final test is based on the generation of an artificial test data set. 
Although the calculation of mass values from the weighing results is quite 
difficult in general, the reverse calculation of weighing results from known 
masses is in fact simple. Thus, we assume that the masses of all the weights are 
known for a simple design with 9 weights. From the assumed environmental 
values and the known weight volumes, heights etc. we calculate the 
corresponding weighing results ($\Delta I$) using Eqs. (2) and (3). These calculated
weighing results will correspond to a “perfect fit”, and the subsequent analysis 
is therefore independent of the uncertainty on the weighing results. In Fig. 4 we 
show the comparison between the assumed input masses and the output masses 
from the analysis, and the agreement is quite good. The relative difference 
between input and output masses is at the $10^{-12}$ level simply due to the number
of digits (12) in the output of the analysis program. For comparison, the
uncertainty in the mass value of the national kilogram prototype is = 3x10”.

Fig. 4. Comparison of input masses used for generating the test data set and the output
masses from the analysis of the test data.


7. Conclusion


In conclusion, we have described the principles of mass calibration and 
introduced the general method of least squares, which has been implemented in 
the new mass calibration software. The validation of the software has been 
discussed and results have been presented.

The paper describes a simple approach to software design in which the ‘Law of
propagation of uncertainty’ is used to obtain measurement results that include a
statement of uncertainty, as described in the Guide to the Expression of
Uncertainty in Measurement (ISO, Geneva, 1995). The technique can be used
directly for measurement uncertainty calculations, but is of particular interest
when applied to the design of instrumentation systems. It supports modularity
and extensibility, which are key requirements of modern instrumentation, without
imposing an additional performance burden. The technique automates the
evaluation and propagation of components of uncertainty in an arbitrary network
of modular measurement components.


1. Introduction 


This paper describes a design technique for software that applies the ‘Law
of propagation of uncertainty’, as recommended in the Guide to the
Expression of Uncertainty in Measurement (henceforth, the Guide)¹. It
automates the task of evaluating and propagating components of uncertainty
and produces software that is both modular and extensible, both of which
are requirements for modern instrumentation.

The technique can be applied to instrument and smart-sensor firmware, 
or to higher level software coordinating measurement systems for specific 
measurements. It is both efficient and simple to implement. Its most strik- 
ing feature is that measurement equations are differentiated automatically, 
allowing uncertainty to be propagated with only a direct statement of mea- 
surement equations. 

The next section shows that the Guide’s formulation of the uncertainty 
propagation law can be easily applied to modular measurement systems. 
By decomposing the evaluation of a measurement value and the associated
uncertainty into an arbitrary set of intermediate calculations, we show that
each step can be considered as a modular component of a system. Section 3 
uses this formulation to identify the basic requirements of modular software 
components capable of encapsulating the calculations. 


2. Calculation of value and uncertainty in modular systems 
2.1. The measurement function 


According to the Guide, a measurement may be modelled by a single
‘measurement function’, $f$, interpreted

“... as that function which contains every quantity, including all
corrections and correction factors, that can contribute a significant
component of uncertainty to the measurement result”¹.

The inputs to this function are uncertain quantities so a measurement result
is estimated as

$$x_m = f(x_1, x_2, \ldots, x_l), \tag{1}$$

where $f$ is evaluated at the estimated mean values of the inputs,
$x_1, x_2, \ldots, x_l$.

For example, electrical power, $P$, dissipated in a resistor may be
estimated by measuring the potential difference across the resistor terminals,
$V$, and the resistance, $R$. The measurement function for this is

$$P = \frac{V^2}{R}, \tag{2}$$

where $V$ and $R$ are the two inputs.

2.2. Decomposing the measurement function 


In the Guide’s approach, the function $f$ represents the entire measurement
procedure. In complex measurement systems it will be difficult to state this
function explicitly. However, $f$ can be conveniently decomposed into a set
of simpler functions of the form

$$x_j = f_j(\Lambda_j), \qquad (3)$$

where $j$ identifies the decomposition step, $x_j$ is the intermediate value at
the step and $\Lambda_j$ is the set of direct inputs to the function $f_j$.$^{a}$

$^{a}$ The labels, $j$, are assigned so that $j > k$, where $k$ is the subscript of any
member of the set $\Lambda_j$.

These functions can be used to evaluate the measurement result by using a simple
iterative algorithm. If there are $m - l$ decomposition steps and $l$ direct
inputs to $f$ (called ‘system inputs’), then

$$x_j = f_j(\Lambda_j), \quad \text{for } j = l+1, \ldots, m. \qquad (4)$$


The decomposition steps of a measurement function are referred to from 
now on as ‘modules’ because we will show that both value and uncertainty 
calculations can actually be encapsulated at each step and thought of as a 
discrete component of a measurement. The algorithm above can be inter- 
preted as a distributed calculation over the modules comprising a system. 

Referring to the simple power measurement example, a possible
decomposition is:

$$x_1 = V \qquad (5)$$
$$x_2 = R \qquad (6)$$
$$x_3 = x_1^2 \qquad (7)$$
$$P = x_4 = x_3 / x_2 \qquad (8)$$

There are two system inputs, $x_1$ and $x_2$, so $l = 2$, and two more modules
that decompose equation (2), so $m = 4$.


2.3. Standard uncertainty and components of uncertainty 


The term ‘standard uncertainty’, denoted $u(x_i)$, is an estimate of the
standard deviation of the random variable associated with the quantity $x_i$. The
standard uncertainty in the measurement result, $u(x_m)$, often referred to
as the ‘combined standard uncertainty’, depends on the input uncertainties
$u(x_1), u(x_2), \ldots, u(x_l)$, their possible correlations and on the function $f$.

The calculation of $u(x_m)$ is simplified by defining components of
uncertainty, which are related to the sensitivity of the measurement function to
its inputs. The ‘component of uncertainty in $x_m$ due to uncertainty in an
input quantity $x_i$’ is defined as$^{b}$

$$u_i(x_m) = \frac{\partial f}{\partial x_i}\, u(x_i), \qquad (9)$$

where the partial derivative is evaluated at the values of $x_1, x_2, \ldots, x_l$.

$^{b}$ The Guide defines a component of uncertainty as the modulus of $u_i(x_m)$ here.

The combined standard uncertainty can be evaluated from the $u_i(x_m)$
and the associated correlation coefficients

$$u(x_m) = \left[ \sum_{i=1}^{l} \sum_{j=1}^{l} u_i(x_m)\, r(x_i, x_j)\, u_j(x_m) \right]^{1/2}, \qquad (10)$$

where the sums are over the $l$ system inputs and $r(x_i, x_j)$ is the correlation
coefficient between inputs $x_i$ and $x_j$.

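In terms of the power example, the components of uncertainty are

$$u_V(P) = u_1(x_4) = \frac{2V}{R}\, u(V) \qquad (11)$$
$$u_R(P) = u_2(x_4) = -\frac{V^2}{R^2}\, u(R), \qquad (12)$$

and if the measurements of $V$ and $R$ are uncorrelated, the combined
standard uncertainty in the power measurement is

$$u(P) = \left[ \frac{4V^2}{R^2}\, u^2(V) + \frac{V^4}{R^4}\, u^2(R) \right]^{1/2}. \qquad (13)$$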
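
The double sum of equation (10) is short enough to state directly in code. The following is a minimal sketch, not code from the paper: the function names `combinedUncertainty` and `powerComponents` and the use of `std::vector` for the correlation matrix are assumptions made for illustration.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Combined standard uncertainty, equation (10).
// u[i]    : signed component of uncertainty u_i(x_m) for system input i
// r[i][j] : correlation coefficient r(x_i, x_j), with r[i][i] = 1
double combinedUncertainty(const std::vector<double>& u,
                           const std::vector<std::vector<double>>& r) {
    assert(r.size() == u.size());
    double sum = 0.0;
    for (std::size_t i = 0; i < u.size(); ++i)
        for (std::size_t j = 0; j < u.size(); ++j)
            sum += u[i] * r[i][j] * u[j];
    return std::sqrt(sum);
}

// Components of uncertainty for P = V^2 / R, equations (11) and (12).
std::vector<double> powerComponents(double V, double R, double uV, double uR) {
    return { 2.0 * V / R * uV, -V * V / (R * R) * uR };
}
```

With an identity correlation matrix this reduces to the root-sum-square of the components for uncorrelated inputs. Keeping the sign of each component matters once the inputs are correlated, because the cross terms can then partly cancel.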


2.4. Decomposition of the uncertainty components 


The key point made in this paper is that evaluation of equation (9) can be
decomposed into steps to obtain an algorithm of the same structure as (4).
This makes it possible to encapsulate both the calculation of components
of uncertainty, and the calculation of value, within modules. Each module
need only evaluate the components of uncertainty of its own output, and the
results will propagate through the system correctly.

Equation (9) can be decomposed by applying the chain rule for partial
differentiation to the decomposition functions of equation (3). A component
of uncertainty, for any given $i$, can then be calculated iteratively:

$$u_i(x_j) = \sum_{x_k \in \Lambda_j} \frac{\partial f_j}{\partial x_k}\, u_i(x_k), \quad \text{for } j = l+1, \ldots, m. \qquad (14)$$

This algorithm propagates the sensitivity of each module function, $f_j$, to a
module input $x_k$, weighted by the component of uncertainty $u_i(x_k)$ in that
input’s value due to uncertainty in the system input $x_i$.$^{c}$

$^{c}$ Equation (14) implies that $u_j(x_j) = u(x_j)$. This is compatible with (9), when $x_m$ is
replaced by $x_j$ and $f$ by $f_j$.

This algorithm, together with equation (4), shows that the uncertainty
component and value calculations of a step can be encapsulated in one
module. The algorithms effectively distribute both value and uncertainty
calculations over a network of interconnected modules.

In the power example, referring to equations (5)–(8), the two uncertainty
functions ($j = 3$ and $j = 4$), for arbitrary $i$, are

$$u_i(x_3) = \frac{\partial f_3}{\partial x_1}\, u_i(x_1) = 2 x_1\, u_i(x_1) \qquad (15)$$

$$u_i(x_4) = \frac{\partial f_4}{\partial x_2}\, u_i(x_2) + \frac{\partial f_4}{\partial x_3}\, u_i(x_3)
           = -\frac{x_3}{x_2^2}\, u_i(x_2) + \frac{1}{x_2}\, u_i(x_3). \qquad (16)$$

In these equations, together with equations (7) and (8), a module need only
obtain information from its immediate inputs (the arguments to $f_j$). So
the $j = 3$ module (equations (7) and (15)) only refers to $x_1$ (the voltage
$V$) and the $j = 4$ module (equations (8) and (16)) only refers to $x_3$ (the
voltage squared, $V^2$) and $x_2$ (the resistance, $R$).
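
As a consistency check (a worked example added here, not part of the original derivation), the module-level rules for $j = 3$ and $j = 4$ can be evaluated for each system input in turn. It assumes the starting values $u_1(x_1) = u(V)$, $u_2(x_2) = u(R)$ and $u_2(x_1) = u_1(x_2) = 0$, and recovers the components stated earlier for the power example.

```latex
% Chain-rule recursion applied to x_3 = x_1^2 and x_4 = x_3 / x_2,
% with x_1 = V and x_2 = R.
\begin{align*}
i = 1:\quad u_1(x_3) &= 2 x_1\, u(V), &
u_1(x_4) &= -\frac{x_3}{x_2^2}\cdot 0 + \frac{1}{x_2}\, 2 x_1\, u(V)
          = \frac{2V}{R}\, u(V), \\
i = 2:\quad u_2(x_3) &= 2 x_1 \cdot 0 = 0, &
u_2(x_4) &= -\frac{x_3}{x_2^2}\, u(R) + \frac{1}{x_2}\cdot 0
          = -\frac{V^2}{R^2}\, u(R).
\end{align*}
```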


3. A design pattern for uncertainty software 


A design pattern describes a generic solution to a broad class of problem. 
Patterns are widely used in software development and are an effective way 
for software developers to communicate and share successful techniques. 
They provide flexibility (e.g. allowing implementation in various
programming languages) while concisely describing a solution strategy. The
GUM Tree design pattern shows how to incorporate the mathematics
underlying uncertainty propagation into the design of measurement systems.

The GUM Tree is closely related to the more general and well documented
‘Interpreter’ pattern$^{7}$. There are two important software classes in
the GUM Tree: a Module interface class and a Context class. The former is
a base class for system modules. It constrains a module’s implementation 
class to meet the basic requirements necessary for uncertainty propagation. 
The Context class encapsulates calculations that must be handled outside 
of modules and data associated with the overall system. 


3.1. The requirements of a module interface 


To evaluate algorithms (4) and (14), the $j$-th module must provide two
pieces of information:

• an output value, $x_j$;
• a component of uncertainty in the (module) output due to uncertainty
  in the $i$-th system input, $u_i(x_j)$.

Every module’s software interface should include functions that return these
numbers.$^{d}$ In C++ pseudo code, such an interface can be declared as a pure
abstract base class$^{e}$

    class ModuleInterface {
    public:
        // Output value x_j of this module.
        virtual double value() = 0;
        // Component of uncertainty u_i(x_j) due to system input i.
        virtual double uComponent(ModuleInterface& i) = 0;
    };


3.2. The Context 


A separate class is required, in addition to the library of module classes, 
to encapsulate global calculations and manage global data. For example, 
the calculation of combined standard uncertainty, equation (10), has to 
be performed on the system modules and needs access to correlation co- 
efficients. An entity called the ‘Context’ can be used to encapsulate this 
type of information.$^{f}$ The Context is also useful for handling language and
implementation specific details and can extend native language features if 
necessary. For instance, it can provide support for memory management 
and expression parsing with abstract data types. 


3.3. Applying the pattern 


Implementation of the GUM Tree pattern requires that a set of module classes
be defined, each of which implements the module interface in a different way.
Module objects can be created from this class library and linked together to
form a system. When in operation, a Context will interrogate the network
of modules to determine measured values and their uncertainties.

Equation (4) requires only that a module be characterised by a simple
function, $f_j$, which may have an arbitrary number of arguments. Hence
modules may be distinct pieces of instrumentation, connected to the system
by some kind of physical communications interface, or they may be abstract
software entities in computer memory.

$^{d}$ Other module interface functions will generally be useful. For instance, an interface
can include a function that returns an iterator for the set of system inputs on which the
module depends directly or indirectly; this helps to sum over $i$ and $j$ in equation (10).

$^{e}$ In practice, a distinct interface class, derived from ModuleInterface, can be useful
to distinguish between system inputs and module inputs. This could be used as the
argument type in the second function declaration.

$^{f}$ Another calculation that the Context would typically handle is the evaluation of effective
degrees-of-freedom$^{1}$.

An example of the latter is a class library for uncertainty calculations®. 
This effectively introduces a new abstract data type for measurement un- 
certainty calculations — an entity that has attributes for both value and 
components of uncertainty. Module classes can implement simple maths 
functions and arithmetic operations, allowing arbitrary measurement func- 
tions to be written directly in software. The explicit derivation of sensitivity 
coefficients is then unnecessary: a measurement function will be parsed au- 
tomatically into a tree of linked module objects, which can be interrogated 
to obtain values and uncertainties. 
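
The idea of a data type that carries both a value and its components of uncertainty can be illustrated with a small value-type sketch. This is a simplification, not the linked-module tree that the pattern describes: the struct `Uncertain`, the helpers `input`, `combine` and `stdUncertainty`, and the integer ids used to identify system inputs are all hypothetical names introduced for illustration, and each result stores its components eagerly instead of being interrogated later.

```cpp
#include <cassert>
#include <cmath>
#include <map>

// Hypothetical value type: an estimate plus signed components of
// uncertainty, keyed by an integer id identifying each system input.
struct Uncertain {
    double value;
    std::map<int, double> u;  // id of system input i -> u_i(x)
};

// Create a system input with value x and standard uncertainty su.
Uncertain input(int id, double x, double su) {
    assert(su >= 0.0);
    return {x, {{id, su}}};
}

// Chain rule for z = f(a, b) with sensitivities dfda and dfdb:
// u_i(z) = dfda * u_i(a) + dfdb * u_i(b) for every system input i.
Uncertain combine(double z, double dfda, const Uncertain& a,
                  double dfdb, const Uncertain& b) {
    Uncertain r{z, {}};
    for (const auto& kv : a.u) r.u[kv.first] += dfda * kv.second;
    for (const auto& kv : b.u) r.u[kv.first] += dfdb * kv.second;
    return r;
}

Uncertain operator*(const Uncertain& a, const Uncertain& b) {
    return combine(a.value * b.value, b.value, a, a.value, b);
}

Uncertain operator/(const Uncertain& a, const Uncertain& b) {
    return combine(a.value / b.value, 1.0 / b.value, a,
                   -a.value / (b.value * b.value), b);
}

// Combined standard uncertainty, assuming uncorrelated system inputs.
double stdUncertainty(const Uncertain& x) {
    double s = 0.0;
    for (const auto& kv : x.u) s += kv.second * kv.second;
    return std::sqrt(s);
}
```

With these operators the power example can be written as `Uncertain P = V * V / R;`, where `V` and `R` are created by `input`, and no sensitivity coefficient has to be derived by hand.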

As an indication of how simple such a class library is, consider the
definition of a module class for division. The function definitions were given
in equations (8) and (16). In addition, the class will need two ‘references’
($x_2$ and $x_3$), which link to a module’s inputs. Member functions use these
references to obtain information (i.e., an output value or an uncertainty
component). In C++ pseudo-code, the class definition for division looks
like®


    class Division : public ModuleInterface {
    protected:
        ModuleInterface& x2;   // denominator input
        ModuleInterface& x3;   // numerator input

    public:
        Division(ModuleInterface& l, ModuleInterface& r) : x2(r), x3(l) {}

        // Equation (8): x4 = x3 / x2.
        double value() { return x3.value() / x2.value(); }

        // Equation (16): chain rule applied to both inputs.
        double uComponent(ModuleInterface& i) {
            double v = x3.value() / x2.value();
            return
                x3.uComponent(i) / x2.value() -
                v / x2.value() * x2.uComponent(i);
        }
    };


$^{g}$ Much like complex numbers, and mathematical support for them, can be added to some
programming languages.

The member function value() corresponds to equation (8) and the
member function uComponent(ModuleInterface& i) to equation (16),
with i a reference to any system input module. The class constructor,
Division(ModuleInterface& l, ModuleInterface& r), establishes
the required links to the module inputs. This type of implementation of
the GUM Tree is closely related to the ‘reverse’ form of automatic
differentiation °.
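
To show how such modules link together, here is a self-contained sketch of the whole power example as a small module tree. The classes `SystemInput` and `Square` and the helper `powerUncertainty` are hypothetical additions, the interface is const-qualified as an editorial choice, and a system input is identified by its address. None of these details are prescribed by the pattern.

```cpp
#include <cassert>
#include <cmath>

// Module interface from Section 3.1, const-qualified for this sketch.
class ModuleInterface {
public:
    virtual ~ModuleInterface() = default;
    virtual double value() const = 0;
    virtual double uComponent(const ModuleInterface& i) const = 0;
};

// Hypothetical leaf module: a system input with value x and uncertainty u.
class SystemInput : public ModuleInterface {
    double x_, u_;
public:
    SystemInput(double x, double u) : x_(x), u_(u) {}
    double value() const override { return x_; }
    // u(x) if i is this input, otherwise zero (identity by address).
    double uComponent(const ModuleInterface& i) const override {
        return (&i == this) ? u_ : 0.0;
    }
};

// Hypothetical module for x3 = x1^2, equations (7) and (15).
class Square : public ModuleInterface {
    const ModuleInterface& x1_;
public:
    explicit Square(const ModuleInterface& x) : x1_(x) {}
    double value() const override { return x1_.value() * x1_.value(); }
    double uComponent(const ModuleInterface& i) const override {
        return 2.0 * x1_.value() * x1_.uComponent(i);
    }
};

// Module for x4 = x3 / x2, equations (8) and (16).
class Division : public ModuleInterface {
    const ModuleInterface& num_;
    const ModuleInterface& den_;
public:
    Division(const ModuleInterface& l, const ModuleInterface& r)
        : num_(l), den_(r) {}
    double value() const override { return num_.value() / den_.value(); }
    double uComponent(const ModuleInterface& i) const override {
        return num_.uComponent(i) / den_.value()
             - value() / den_.value() * den_.uComponent(i);
    }
};

// Builds the tree for P = V^2 / R and combines the two components,
// assuming V and R are uncorrelated.
double powerUncertainty(double V, double uV, double R, double uR) {
    assert(R != 0.0);
    SystemInput v(V, uV), r(R, uR);
    Square v2(v);
    Division p(v2, r);
    return std::hypot(p.uComponent(v), p.uComponent(r));
}
```

Each module asks only its immediate inputs for a value or a component, so replacing `r` with a larger sub-tree would leave `Square` and `Division` unchanged.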

The extensibility of GUM Tree designs relies on the common Module 
interface. For example, the input links in the pseudo code above gave no 
hint as to the extent of the module network connected to each input. A
link could point to a simple input or to a complicated module network. 
This generality allows exchange and extension to occur dynamically in a 
system. 

For instance, equation (6) attributed a system input value of $R$ to the
measurement equation. However, as a modification to the system, a
temperature dependent resistance

$$R = x_2 = R_0 \left[ 1 - \alpha (T - T_0) \right] \qquad (17)$$

could be associated with that step in the calculation. The network of
modules representing the new resistance calculation is then connected at
this point (i.e., a decomposition tree of modules representing equation (17)).
This change does not require any changes to code in other modules.$^{h}$


4. Discussion and Conclusions 


The GUM Tree design pattern is a simple technique that represents a sig- 
nificant improvement over current practice and should ultimately result in 
more reliable and robust measurement and control systems throughout so- 
ciety. It could be deployed in most modern instrument designs where it 
would allow systems to report measured values together with uncertainties. 

Current trends in the test and measurement industry are towards
intelligent and flexible system components with industry-standard
interfaces to facilitate exchange, reconfiguration and long-term maintenance
requirements**?. The modularity and extensibility of the GUM Tree
pattern are fully compatible with this trend and have the potential to
substantially enhance the expected benefits. The GUM Tree design enables systems
to dynamically report uncertainty as operational parameters change, which
greatly enhances the integrity of a system’s measurement results. One
interesting possibility is to reduce the cost-of-ownership burden associated with
revalidating test and measurement systems when components are changed.

$^{h}$ The calculation of combined standard uncertainty will naturally be different and will
include the additional uncertainty terms. However, this calculation can easily be made
generic and parameterised on information available from the module interface.

The information required for the uncertainty function of the GUM Tree 
module interface would be known during the design of an instrument or 
sensor, so there will be little difficulty in providing software to report it. 
On the other hand, legacy instruments could be endowed with a GUM Tree 
compatible interface, by introducing an additional software layer similar to 
the ‘Role control module’ approach described in 3. 

The new software data type, representing measurement results by en- 
capsulating value and uncertainty information, is particularly interesting. 
It could be used directly as a convenient tool for data processing (an
alternative to ‘uncertainty calculator’ applications). However, it is potentially
more powerful in the internal software of measuring instruments. An
instrument’s calculations would then be following the Guide’s recommendations
and handling uncertainty at a very low level. This would not compromise 
proprietary algorithms, but would give external access to the sensitivity 
coefficients required for uncertainty calculations. 

In conclusion, the GUM Tree design pattern is an elegant approach to
instrument software design that allows the ‘Law of propagation of
uncertainty’ to be applied to obtain measurement results that include a
statement of uncertainty, as described in the Guide. The technique is
applicable to measurement systems of many kinds, large and small. It supports modularity and
extensibility, which are key requirements of modern instrumentation sys- 
tems, and does not impose significant performance requirements. There is 
increasing pressure on test and measurement manufacturers to provide sup- 
port for uncertainty that follows the recommendations of the Guide. The 
GUM Tree design pattern offers a solution to this problem. 
The identification of the phase transition is required in thermometry in order to
reproduce the International Temperature Scale of 1990 (ITS-90) defining fixed 
points. 

This paper proposes the use of statistical hypotheses testing methodologies for 
phase transition identification by simply monitoring the temperature behaviour. 
Only the pure structural change model is taken into consideration, that is the 
model in which the components of the parameter vector are allowed to change all 
together. 

A statistical test for a single known change point is briefly presented in the paper. 
The procedure is extended to the more interesting application of the detection of 
a single unknown change point. A sliding window algorithm is proposed for the 
on-line detection of a possible change point. 

The method has been applied to identify the triple points of the four gases realizing 
the cryogenic range of the ITS-90. The pure structural change model has proved to 
be an innovative tool for a reproducible and reliable phase transition identification. 


1. Introduction 


It is generally recognized that a physical quantity experiences changes as it 
evolves over time, especially over a long period. In a parametric regression 
framework, structural change may be defined as a change in one or more 
parameters of the model in question. This can also be seen as a parameter 
instability problem. Deeper knowledge on the parameter behaviour gives 
more information on whether a particular model is correct or not. Poor 
forecasts may result if the model does not account for structural change. 


*Work partially funded under EU SofTools.MetroNet Contract N. GGRT-CT-2001-05061 
There are many examples in engineering, physics and metrology where
the identification of a defined phenomenon can be done by monitoring
the parameter changes. This applies to processes that change from
one regime to another, in a more or less gradual fashion, as in the case of
the first order phase transition detection. In this particular case several
physical quantities, such as entropy and volume show abrupt changes. On 
the other hand, phase transition occurrences correspond to well defined 
temperature states that can be used as fixed points as recommended in 
the International Temperature Scale of 1990 (ITS-90) [1]. Therefore, the 
identification of a phase transition is one of the fundamental aspects of 
thermometry. 

The procedures currently used to realize a fixed point generally require 
the judgement of a trained operator, to decide when the phase transition 
is reached. Because phase transition measurements are extremely time 
consuming and require highly reproducible conditions, it is more convenient 
to carry out these measurements automatically. During a constant power 
approach to the transition, the temperature dependence on time abruptly 
changes its shape, as soon as the phase transition occurs. A temperature 
plateau happens and the mathematical model parameters change in a very 
definite way. To detect this change the statistical test presented below has 
been employed. 


1.1. Assessed problem 


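Suppose we deal with a sequence of time ordered data. That is, at the
moment $n$, $y_n \in \mathbb{R}$ and $x_n = (x_{n1}, \ldots, x_{np}) \in \mathbb{R}^p$ are acquired,
$n = 1, \ldots, N$, where $N$ is the sample size. Considering that $Y$ is the response
variable, denote by $Y = (y_1, \ldots, y_N)' \in \mathbb{R}^N$ the column vector of its
observed values. Consider that $X$ is the explanatory variable and denote by
$X = (x_1' \; x_2' \; \ldots \; x_N')' \in \mathbb{R}^{N \times p}$ the matrix of its observed values.

Assume that the studied phenomenon can be described by a linear
regression model:

$$Y = X\beta + \varepsilon, \qquad
\beta = (\beta_1, \beta_2, \ldots, \beta_p)' \in \mathbb{R}^p, \qquad
\varepsilon = (\varepsilon_1, \varepsilon_2, \ldots, \varepsilon_N)' \in \mathbb{R}^N, \quad
\varepsilon_n \ \text{i.i.d.} \sim N(0, \sigma). \qquad (1)$$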


The main goal of the regression analysis is to find the best estimate $\hat{\beta}$ of $\beta$
with respect to a chosen norm. That is, find $\hat{\beta}$ such that:

$$\min_{\beta} \|Y - X\beta\| = \|Y - X\hat{\beta}\|. \qquad (2)$$

In this work the $L_2$ norm was used, that is:

$$\min_{\beta} \|Y - X\beta\|_2^2 = \min_{\beta}\, (Y - X\beta)'(Y - X\beta)
  = (Y - X\hat{\beta})'(Y - X\hat{\beta}). \qquad (3)$$

The problem assessed in this paper is the following: should $\beta$ be inferred using
the entire sample of data, or would it be more appropriate to split the data
into two subsets and estimate the different subset parameters $\beta_1, \beta_2, \ldots$? For
example, looking at the simulated data in figure 1, it is quite obvious
that more correct information would be obtained if the data were split into
two subsets around the point $x = 32$.

Figure 1. Simulation of a process showing a change of the model parameters around
the point $x = 32$.


2. Structural Change Model 
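Consider the following linear regression model with a change point at $N_0$:

$$y_n = \begin{cases} x_n \beta_1 + \varepsilon_n, & n = 1, \ldots, N_0 \\
                      x_n \beta_2 + \varepsilon_n, & n = N_0 + 1, \ldots, N \end{cases} \qquad (4)$$

where the $\varepsilon_n$ are i.i.d., $\varepsilon_n \sim N(0, \sigma)$, $\beta_1 = (\beta_{11}, \ldots, \beta_{1p})' \in \mathbb{R}^p$ and
$\beta_2 = (\beta_{21}, \ldots, \beta_{2p})' \in \mathbb{R}^p$, for some known or unknown change point
$N_0 \in \{2, \ldots, N - 1\}$. Assume that the matrices $\sum_{n=1}^{N_0} x_n' x_n$ and
$\sum_{n=N_0+1}^{N} x_n' x_n$ are both of full rank $p$.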

In this paper we take into consideration only the case of a pure structural 
change model, where all the components of the parameter $\beta$ are allowed to
change and checked for variation all together. 

There are four statistical items to be discussed when considering a struc- 
tural change model. The first point is to assess whether a change occurred 
or not. That is, by statistical means, decide whether there is evidence to 
use a model like (4) to perform the required inferences, rather than the 
classical model (1) for the same purposes. In case we decide for a change 
occurrence, the second phase is to estimate when the change happened in 
the studied data set, for an unknown change point. The rest of the statisti- 
cal procedure is well known: thirdly, estimate the parameters and, fourthly, 
forecast future values. In this paper we discuss only the first item: assess- 
ing the occurrence of a change in the framework of a single possible change 
point. 


2.1. Parametric hypotheses testing 


In order to decide for change in the analyzed parameters, a statistical test
must be performed. The main ingredients of a statistical test are the null
hypothesis $H_0$ and the alternative hypothesis $H_1$. $H_0$ is the statement
we trust unless the sample observations provide evidence favourable to the
alternative statement $H_1$. Testing $H_0$ against $H_1$ leads in fact to only two
possible conclusions: a) “reject $H_0$” and b) “do not reject $H_0$”, without
any inference about $H_1$.

Another choice the operator has to make is the test significance level $\alpha$,
that is the probability of rejecting the null hypothesis $H_0$ when it is true.
This probability $\alpha$ is also called the probability of an error of the 1st type, see [2].

A statistical test is performed by means of a test statistic $T$. $T$ is a
statistic whose distribution is known when the parameter $\theta$ is known. $T$
is generally determined on the basis of the parameters to be tested and of
their distribution [2].

With the above mentioned items, a critical region $CR$ is defined.
$CR$ is an interval containing possible values of $T$ such that
$\alpha = P(T \in CR \mid H_0 \text{ is true})$. A best critical region $CR^*$ is to be chosen,
that is a $CR$ with the following additional property:

$$\gamma = P(T \notin CR^* \mid H_1 \text{ is true}) = \min_{CR} P(T \notin CR \mid H_1 \text{ is true}). \qquad (5)$$

$\gamma$ is also called the probability of an error of the 2nd type.
The decision provided by a statistical test is then defined by:

$$T \in CR^* \implies \text{reject } H_0, \qquad
  T \notin CR^* \implies \text{do not reject } H_0. \qquad (6)$$


2.2. Structural change hypotheses testing: the known
change point case


In case of a known change point $N_0 \in \{2, \ldots, N - 1\}$, the test to be
performed in order to decide whether a change occurred or not is the following:

$$H_0: \beta_1 = \beta_2 \quad \text{against} \quad H_1: \beta_1 \neq \beta_2, \qquad (7)$$

where the notation of (4) is retained. It is here that the assumption
of a pure structural change model appears.

Using the ordinary least squares (OLS) method as an estimation
method, denote by $RSS$ the sum of squared residuals obtained by solving
the model (1) and denote by $RSS_{12}$ the sum of the two
sums of squared residuals obtained by performing the two distinct
regressions corresponding to the model (4). That is,

$$RSS = Y' \left[ I - X (X'X)^{-1} X' \right] Y,$$
$$RSS_{12} = Y_1' \left[ I - X_1 (X_1'X_1)^{-1} X_1' \right] Y_1
           + Y_2' \left[ I - X_2 (X_2'X_2)^{-1} X_2' \right] Y_2, \qquad (8)$$

where

$$Y_1 = \begin{pmatrix} y_1 \\ \vdots \\ y_{N_0} \end{pmatrix}, \quad
  Y_2 = \begin{pmatrix} y_{N_0+1} \\ \vdots \\ y_N \end{pmatrix}, \quad
  X_1 = \begin{pmatrix} X_{1\cdot} \\ \vdots \\ X_{N_0\cdot} \end{pmatrix}, \quad
  X_2 = \begin{pmatrix} X_{N_0+1\cdot} \\ \vdots \\ X_{N\cdot} \end{pmatrix}, \qquad (9)$$

and $A_{i\cdot}$ denotes the $i$-th row of a matrix $A$.

G. Chow [3] proved that the test statistic for (7) is given by:

$$T_{Chow} = \frac{(RSS - RSS_{12})/p}{RSS_{12}/(N - 2p)}, \qquad (10)$$

which has an exact $F$ distribution with $p$ and $N - 2p$ degrees of freedom.

Applying the Chow test to the data presented in figure 1, the test
statistic $T$ takes the value 6511.7, while the 99% critical value (defining
the critical region) is 4.7. As expected, the decision to be taken is that a
change in the parameter values must be considered.
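
For the common special case of a straight-line model (intercept and slope, so $p = 2$), the Chow statistic needs only sums of squares. The sketch below makes that assumption explicitly; it is not the authors' Matlab implementation, and the function names `rssLine` and `chowStatistic` are hypothetical. The argument `n0` is the number of samples in the first regime.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Residual sum of squares of the OLS line y = b0 + b1 * x fitted to
// the samples with indices in [first, last).
double rssLine(const std::vector<double>& x, const std::vector<double>& y,
               std::size_t first, std::size_t last) {
    assert(last > first + 1 && last <= x.size() && x.size() == y.size());
    const double n = static_cast<double>(last - first);
    double sx = 0.0, sy = 0.0;
    for (std::size_t i = first; i < last; ++i) { sx += x[i]; sy += y[i]; }
    const double mx = sx / n, my = sy / n;
    double sxx = 0.0, sxy = 0.0, syy = 0.0;
    for (std::size_t i = first; i < last; ++i) {
        const double dx = x[i] - mx, dy = y[i] - my;
        sxx += dx * dx; sxy += dx * dy; syy += dy * dy;
    }
    assert(sxx > 0.0);  // full rank: at least two distinct x values
    return syy - sxy * sxy / sxx;
}

// Chow statistic for a known change point, with p = 2 parameters:
// ((RSS - RSS12) / p) / (RSS12 / (N - 2p)).
double chowStatistic(const std::vector<double>& x, const std::vector<double>& y,
                     std::size_t n0) {
    const std::size_t N = x.size();
    const std::size_t p = 2;  // intercept and slope
    assert(y.size() == N && n0 > p && N > n0 + p);
    const double rss = rssLine(x, y, 0, N);
    const double rss12 = rssLine(x, y, 0, n0) + rssLine(x, y, n0, N);
    return ((rss - rss12) / static_cast<double>(p)) /
           (rss12 / static_cast<double>(N - 2 * p));
}
```

The returned value is then compared with the chosen critical value of the $F$ distribution with $p$ and $N - 2p$ degrees of freedom, which this sketch leaves to a table or a statistics library.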


2.3. Structural change hypotheses testing: the unknown 
change point case 


Consider now only the simplest case of a single unknown change point:

$$y_n = \begin{cases} x_n \beta + \varepsilon_n, & n = 1, \ldots, N_0 \\
                      x_n \beta + x_n \beta_1 + \varepsilon_n, & n = N_0 + 1, \ldots, N \end{cases} \qquad (11)$$

where $N_0 \in \mathcal{N}$. The set $\mathcal{N} \subset \{2, \ldots, N - 1\}$ will be called the
“inspection interval” in this paper. Later we will see why it is necessary to introduce it.

Then the statistical test to be performed is expressed as:

$$H_0: \beta_1 = 0 \quad \text{against} \quad H_1: \beta_1 \neq 0 \ \text{for some } N_0. \qquad (12)$$

Testing for structural change with unknown change point does not fit into
the regular testing framework because the parameter $N_0$ only appears under
the alternative hypothesis and not under the null hypothesis $H_0$. In consequence,
the Wald, the Lagrange multiplier and the likelihood-ratio like tests do
not possess their standard large sample asymptotic distribution and hence
cannot be used.
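Denoting by:

$$X_{1n} = (x_1' \; \ldots \; x_n' \; 0' \; \ldots \; 0')' \quad \text{and} \quad
  X_{2n} = (0' \; \ldots \; 0' \; x_{n+1}' \; \ldots \; x_N')' \in \mathbb{R}^{N \times p}, \qquad (13)$$

the model (11) can be rewritten as:

$$Y = X\beta + X_{2N_0}\beta_1 + \varepsilon. \qquad (14)$$

Let $RSS(n)$ be the sum of squared residuals and let $(\hat{\beta}_n, \hat{\beta}_{1n})$ be the OLS
estimator of $(\beta, \beta_1)$ obtained by regressing $Y$ on $X$ and $X_{2n}$. In the case of unknown
variance $\sigma^2$, it can be estimated by $\hat{\sigma}^2(n) = RSS(n)/(N - 2p)$. The test
statistic for the unknown single change point model is then constructed
simply by generalizing the known change point case by defining first
$M = I - X(X'X)^{-1}X'$ and
$W(n/N) = \left[ \hat{\beta}_{1n}' (X_{2n}' M X_{2n}) \hat{\beta}_{1n} \right] / \hat{\sigma}^2(n)$.
Then:

$$T = \sup_{n \in \mathcal{N}} W\!\left(\frac{n}{N}\right). \qquad (15)$$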
D. Andrews [4] derived the asymptotic distribution of this test statistic.
Note that in the case of the known change point model the
exact distribution was found, while in this case only the asymptotic one
can be obtained. Let $B_p(\cdot)$ be a $p$-vector of independent Brownian
motions on $[0, 1]$ restricted to the inspection interval $\mathcal{N}$. In [4], calling

$$Q_p(\pi) = \frac{(B_p(\pi) - \pi B_p(1))' (B_p(\pi) - \pi B_p(1))}{\pi (1 - \pi)}$$

a standardized tied-down Bessel process, Andrews proved that the statistic $W(\cdot)$
converges weakly to $Q_p(\cdot)$, while $T = \sup_{n \in \mathcal{N}} W(\cdot)$ converges in
distribution to $\sup_{\mathcal{N}} Q_p(\cdot)$. He also proved that the test statistic does not converge
on the entire interval. This seems reasonable, as a change at the
extremes of the interval would mean in fact “no change at all”. This is the
reason why the inspection interval must be kept away from the ends of the
data interval. The critical values for the statistic $\sup_{\mathcal{N}} Q_p(\cdot)$, with respect
to various inspection intervals $\mathcal{N}$, are obtained in the literature by Monte
Carlo simulations; see [4] for example. An estimate $\hat{N}_0$ of the change
point $N_0$ is defined by the point at which the supremum (15) is
reached. For any confidence level, confidence intervals can be constructed
as in [5], for example.
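
The scan over the inspection interval can be sketched for the same straight-line special case ($p = 2$). The sketch relies on an assumption that should be checked against [4] before use: in the pure structural change model the quadratic form in $W$ equals the reduction in the residual sum of squares, so that $W(n/N) = (RSS - RSS(n)) / \hat{\sigma}^2(n)$, with $RSS(n)$ obtained from two separate line fits. The names `ScanResult`, `rssLine` and `supWald` are hypothetical, and the critical values still have to be taken from the simulated tables in the literature.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

struct ScanResult {
    std::size_t changePoint;  // number of samples in the first regime
    double supW;              // largest statistic over the inspected points
};

// Residual sum of squares of the OLS line fitted to indices [first, last).
double rssLine(const std::vector<double>& x, const std::vector<double>& y,
               std::size_t first, std::size_t last) {
    assert(last > first + 1 && last <= x.size() && x.size() == y.size());
    const double n = static_cast<double>(last - first);
    double sx = 0.0, sy = 0.0;
    for (std::size_t i = first; i < last; ++i) { sx += x[i]; sy += y[i]; }
    const double mx = sx / n, my = sy / n;
    double sxx = 0.0, sxy = 0.0, syy = 0.0;
    for (std::size_t i = first; i < last; ++i) {
        const double dx = x[i] - mx, dy = y[i] - my;
        sxx += dx * dx; sxy += dx * dy; syy += dy * dy;
    }
    assert(sxx > 0.0);
    return syy - sxy * sxy / sxx;
}

// Scans candidate change points n = lo, ..., hi (the inspection interval)
// and returns the point where W(n) = (RSS - RSS(n)) / (RSS(n) / (N - 2p))
// is largest.
ScanResult supWald(const std::vector<double>& x, const std::vector<double>& y,
                   std::size_t lo, std::size_t hi) {
    const std::size_t N = x.size();
    const std::size_t p = 2;  // intercept and slope
    assert(y.size() == N && lo > p && lo <= hi && hi + p < N);
    const double rss = rssLine(x, y, 0, N);
    ScanResult best{lo, -1.0};
    for (std::size_t n = lo; n <= hi; ++n) {
        const double rssN = rssLine(x, y, 0, n) + rssLine(x, y, n, N);
        const double w = (rss - rssN) / (rssN / static_cast<double>(N - 2 * p));
        if (w > best.supW) best = {n, w};
    }
    return best;
}
```

The bounds `lo` and `hi` keep the scan away from both ends of the data, as the text requires. The point at which the maximum is reached serves as the change point estimate.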


3. Application to phase transition identification in 
cryogenic thermometry 


We applied the statistical method presented in the previous section to iden- 
tify the triple points of the gases realizing the cryogenic range of ITS-90. 
The phase transition identification allows us to recognize a physical state 
corresponding to a well defined temperature that we want to reproduce. 
When the phase transition is approached, supplying a constant power to 
the system, the transition occurrence is characterized by a sudden change in 
temperature versus time behaviour, which corresponds to discontinuities in 
some physical parameters. As soon as the transition happens, the melting 
process starts and a temperature plateau occurs. The rapid identification 
of the change gives information on the entire melting process. 

In fact, by identifying the extremes of the melting, the total heat of 
fusion and the transition temperature at different melted gas fractions are 
determined. These parameter values are more accurately obtained when a 
“real-time” identification of the starting of the fusion process is performed. 

Using the statistical test previously presented, together with a simple 
temperature monitoring, we detected the phase transition by means of a 
completely automatic procedure. The identification was performed on-line 
with a negligible delay with respect to the plateau duration. We actually
applied this statistical test to the triple points of H$_2$, Ne, O$_2$ and Ar,
covering the cryogenic temperature range.

In cryogenic metrology other techniques, mainly based on discontinuity 
detection, are used for phase transition identification. All these methods 
generally perform approximations and computations for each data point 
acquired. 

In our case only the temperature monitoring is required: there is no
need to know specific characteristics of the phase transition under study. In
contrast with other techniques, no measurements of physical quantities, 
except temperature, must be performed to investigate the occurrence of 
the transition. 
In particular, phase transitions happening at unknown temperatures can 
be recognized. This means that the same technique, practically with the 
same parameters, can be applied for different phase transitions. Only the 
power supplied to the system must be scaled with temperatures. 


3.1. On-line change detection 


The change detection algorithms are mainly designed to be used off-line
because they require the analysis of the entire data set. We used a sliding
window procedure for the on-line detection of the phase transition. The
statistical analysis was implemented in Matlab® and inserted in the overall
LabVIEW™ data acquisition software. It should be noted that,
being an algorithm, our statistical test can be written in any programming
language. Moreover, being based on OLS, in a very extreme simplification,
it needs only basic arithmetic operations in order to be implemented.

Figure 2. The sliding window procedure.

In figure 2 the sliding window scheme is presented. As specified in the 
previous sections, the test statistic (15) does not converge on the entire 
interval. In practice this means that a check for change corresponding to 
all points cannot be performed. In order to be able to check all the data for 
change, the data windows must be overlapped. Let the overlapping interval, 
OT, be the intersection of two consecutive data windows. Unfortunately the 
overlapping introduces a delay in the change detection as we can state the 
change (or not) only at the end of the data window we currently analyze. On 
the other hand, not checking all the data means to accept the risk of never 
detecting the change, which, at least in our case, is more dangerous than the 
previous situation. The parameters to be defined when using the proposed- 
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Figure 3. Example of a Neon triple point realization. The data, the detected change 
point and the end of the window containing the detected change point are indicated. 
Parameter values: N = 100, a = 0.05,N = 70(centered), OI = 50. 


sliding window procedure are the statistical test level a, the window size 
N, the inspection interval M and the overlapping interval OI. We varied 
these parameters in order to discover how sensitive our problem is to their 
values. Needless to say, the choice of the test significance level œ can be 
made arbitrarily, also because the kind of change we are dealing with is so 
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abrupt that it should be detected at any significance level. ‘There is a strong 
relationship among the other three parameters. The greater the window 
size, the longer the inspection interval can be, but the analysis tends to 
an off-line one. The greater the ratio M/N is, the smaller the overlapping 
interval can be, and hence the faster the detection. We observed that the 
detection is not at all sensitive to the variation of these parameters, in the 
sense that the change is always detected. The only sensitivity is related 
to the delay introduced by the parameters choice. In figure 3 an example 
of a phase transition detection of a Neon triple point realization is shown. 
Using an acquisition interval of 1 s, with N = 100, M = 70 (centered), OI = 50,
α = 0.05, the greatest delay we obtained in this application was about 1
minute. Further studies on these parameters should be performed in order 
to minimize the delay. 
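
The interplay between N, M and OI described above can be illustrated with a minimal sketch. It assumes zero-based sample indices, a window step of N - OI samples and a centered inspection interval; the function names are illustrative and are not part of the original procedure, and the test statistic itself is not implemented. Under these assumptions a change point is reported at the end of the first window whose inspection interval contains it.

```python
def sliding_windows(n_points, N, OI):
    """(start, end) index pairs of windows of length N that overlap by OI samples."""
    step = N - OI
    if step <= 0:
        raise ValueError("OI must be smaller than N")
    return [(s, s + N) for s in range(0, n_points - N + 1, step)]


def inspected_range(start, N, M):
    """Centered inspection interval of M samples inside the window starting at `start`."""
    margin = (N - M) // 2
    return start + margin, start + margin + M


def detection_delay(p, N, M, OI, n_points):
    """Samples elapsed between a change at index p and the last sample of the
    first window whose inspection interval contains p (None if p is never inspected)."""
    for start, end in sliding_windows(n_points, N, OI):
        lo, hi = inspected_range(start, N, M)
        if lo <= p < hi:
            return (end - 1) - p
    return None


# Parameter values quoted in the text: N = 100, M = 70 (centered), OI = 50.
worst = max(detection_delay(p, 100, 70, 50, 300) for p in range(85, 285))
print(worst)
```

With these parameter values the worst-case delay of the sketch, away from the very first window, is 64 samples, i.e. about one minute at one acquisition per second, which is in line with the delay reported above.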


4. Conclusions 


The statistical test presented proved to be a powerful tool for determining 
whether a pure structural change has taken place. Furthermore, it charac- 
terizes the detected change by providing confidence levels and confidence 
intervals. Particular features of the test presented in this paper are the 
low computational cost and its flexibility. The procedure discussed here 
is based on least squares computations, and hence is not very burdensome 
computationally. On the other hand, the flexibility is shown by the fact 
that we can apply the same test to different kinds of data: phase transi- 
tion identifications, change in sensor calibration curves, study of ageing of 
materials, etc. 

The analyzed data proved that the structural change point model is a 
reproducible and reliable method for the identification of the phase transi- 
tion of the four gas triple points realizing the cryogenic range of ITS-90. 

In future work we intend to compare the pure structural change model 
to a partial change model and to a gradual change model, all of them being 
possible candidates for our phase transition problem.


A measurement process has imperfections that give rise to uncertainty in each
measurement result. Statistical tools can assess the uncertainties
associated with the results only if all the relevant quantities involved in the
process are interpreted or regarded as random variables. In other words, all the
sources of uncertainty are characterized by probability distribution functions,
the form of which is either known from measurements or
unknown and therefore conjectured. Entropy is an information measure associated
with the probability distribution of any random variable, so it plays an
important role in metrological activity.

In this paper the authors introduce two basic entropy optimization principles,
Jaynes's principle of maximum entropy and Kullback's principle of
minimum cross-entropy (minimum directed divergence), and discuss
methods to approach the optimal solution of those entropic forms in some
specific measurement models.


1. Introduction 


A statement of measurement uncertainty is not complete if it merely gives
point estimates; it is usually given as a coverage interval (or region)
associated with the result of a measurement, which characterizes the
dispersion of the values that could reasonably be attributed to the measurand,
that is, the quantity subjected to measurement [1].

This goal is achievable only if there are basic probability assignments to all 
the relevant quantities involved in the measurement process and if it is assumed 
that one has some empirical knowledge of their distributional characteristic. 

To fix the form of a distribution, a rather personalistic point of view, which
takes into account the available nonstatistical knowledge, could be introduced.
It is clearly preferable to have an objective probability assignment, so that
different researchers (laboratories) would infer the same distribution from the
data associated with the same measurand obtained in the same way and in the
same circumstances.

The application of Jaynes's principle of maximum entropy (PME) could
satisfy this requirement. In fact, the concept of entropy is closely tied to the concept
of uncertainty embedded in probability distributions: entropy can be defined as a
measure of probabilistic uncertainty.

In particular if we had to choose a probability distribution for a chance 
experiment (measurement) without any prior knowledge about that distribution, 
it would seem reasonable to pick the uniform distribution because we have no
reason to choose any other (as stated by the Principle of Insufficient Reason) 
and because that distribution maximizes the “uncertainty” of the outcome. But 
what if we had some prior knowledge of the distribution? 

Suppose for example that we know of some constraints that the moments of 
that distribution have to satisfy. It is here that Kullback’s principle of minimum 
cross entropy, derived from Shannon’s measure of uncertainty or entropy, plays 
its role. 

The principle of minimum cross-entropy provides a general method of 
inference about an unknown probability density when there exists a prior 
estimate of the density and new information in the form of constraints on 
expected values. The principle states that, of all densities that satisfy the 
constraints, one should choose the one with the least cross-entropy. 

It is important to emphasize that minimizing cross-entropy is equivalent
to maximizing entropy when the prior is a uniform distribution.
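
This equivalence can be checked numerically on a discrete distribution. The sketch below is only an illustration: the distribution `g` is a hypothetical example, and the discrete sums stand in for the integrals used in the paper. For a uniform prior over n states the cross-entropy equals ln(n) minus the entropy, so the two optimization problems have the same solution.

```python
import math


def entropy(g):
    """Shannon entropy -sum g_i ln g_i of a discrete distribution."""
    return -sum(gi * math.log(gi) for gi in g if gi > 0.0)


def cross_entropy(g, p):
    """Kullback cross-entropy sum g_i ln(g_i / p_i) of g relative to the prior p."""
    return sum(gi * math.log(gi / pi) for gi, pi in zip(g, p) if gi > 0.0)


g = [0.5, 0.3, 0.2]                      # hypothetical distribution
uniform = [1.0 / len(g)] * len(g)        # uniform prior

# With a uniform prior: cross-entropy = ln(n) - entropy.
gap = cross_entropy(g, uniform) - (math.log(len(g)) - entropy(g))
print(abs(gap) < 1e-12)
```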

In conclusion, a measurement process can be regarded as a controlled
learning process in which various aspects of uncertainty analysis are
investigated and a substantial amount of information is gained with respect to the
conditions prior to the measurement. It can therefore be connected to the
entropy optimization principles, which are correct
methods of inductive inference when, before the measurement process is carried
out, no sufficient knowledge about the
statistical distributions of the involved random variables is available
except for the permitted ranges, the essential
model relationships and some constraints gained from past experience, usually
expressed in terms of expectations of given functions or bounds on them.

In this paper, by introducing Jaynes's principle of maximum entropy
and Kullback's principle of minimum cross-entropy (minimum directed
divergence), the authors discuss methods to approach the optimal solution of
those entropic forms in some specific measurement models.




2. The Entropy Optimization Principles and the Measurement 
Models 


Entropy is an information measure. In this paper we are mainly concerned
with the entropy of a continuous distribution, defined by [2]:

$$H(f) = -\int f(m)\,\ln f(m)\,dm \tag{1}$$

where $f(m)$ is the probability density function of a random variable $M$,
with the convention that $f \ln f = 0$ where $f = 0$.

The principle of maximum entropy (MAXENT) states that, given a
prescribed set of values $\mu_r$, $r = 0, 1, 2, \ldots$, of all the unknown
distributions $g(m)$ (possibly infinitely many) that satisfy constraints of the form

$$\mu_r = \int m^r\,g(m)\,dm, \qquad r = 0, 1, 2, \ldots \tag{2}$$

the best estimate of $g(m)$ is the one with the largest entropy (1).


The principle of minimum cross-entropy (MINCENT) is a generalization
that applies when a prior distribution, say $p(m) \neq 0$, that estimates
$g(m)$ is known in addition to the constraints (2).
The principle states that, of all the distributions $g(m) \geq 0$ that satisfy the
constraints, one should choose the one having the least cross-entropy, defined as:

$$H(g, p) = \int g(m)\,\ln\frac{g(m)}{p(m)}\,dm \tag{3}$$


From an operational point of view, the method of Lagrange multipliers can
be applied to determine the best estimate of $g(m)$ in both cases.


2.1. Application of the MAXENT to a measurement model 


Given a quoted measurement uncertainty represented by the interval $[a, b]$,
let $M$ be a random variable in the interval $[a, b]$ representing the measure,
whose statistical parameters are the expected value $E\{M\}$ and the variance
$\mathrm{Var}\{M\}$.

The following relation is satisfied: $E\{M^2\} = E^2\{M\} + \mathrm{Var}\{M\}$.


Let $S$ be the family of probability distributions every member of which,
with $f(m)$ the generic one, is consistent with the constraints:

$$\int_a^b f(m)\,dm = 1; \qquad \int_a^b m\,f(m)\,dm = E\{M\}; \qquad \int_a^b m^2 f(m)\,dm = \mathrm{Var}\{M\} + E^2\{M\} \tag{4}$$


We have to determine in S that distribution whose entropy is greater than 
that of every other member of the family, using Lagrange’s multipliers method. 




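Taking into account equations (1) and (4), the Lagrangian is written:

$$L = -\int_a^b f(m)\,\ln f(m)\,dm - \lambda_0\left[\int_a^b f(m)\,dm - 1\right] - \lambda_1\left[\int_a^b m\,f(m)\,dm - E\{M\}\right] - \lambda_2\left[\int_a^b m^2 f(m)\,dm - \mathrm{Var}\{M\} - E^2\{M\}\right]$$

where $\lambda_0, \lambda_1, \lambda_2$ are the Lagrange multipliers.
From the Euler-Lagrange equation $-\ln f(m) - 1 - \lambda_0 - \lambda_1 m - \lambda_2 m^2 = 0$ we get

$$f(m) = e^{-(1+\lambda_0)}\,e^{-\lambda_1 m - \lambda_2 m^2} = e^{-(1+\lambda_0) + \frac{\lambda_1^2}{4\lambda_2}}\,e^{-\lambda_2\left(m + \frac{\lambda_1}{2\lambda_2}\right)^2}, \qquad \lambda_2 > 0 \tag{5}$$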


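By introducing the following transformation on the Lagrange multipliers:

$$A = \sqrt{2\pi}\,\sigma\,e^{-(1+\lambda_0) + \frac{\lambda_1^2}{4\lambda_2}}, \qquad \mu = -\frac{\lambda_1}{2\lambda_2}, \qquad \sigma^2 = \frac{1}{2\lambda_2}, \qquad A > 0,\ \sigma > 0$$

we can write:

$$f(m) = A\,\frac{1}{\sqrt{2\pi}\,\sigma}\,e^{-\frac{(m-\mu)^2}{2\sigma^2}} \tag{6}$$

where

$$A^{-1} = \int_a^b \frac{1}{\sqrt{2\pi}\,\sigma}\,e^{-\frac{(m-\mu)^2}{2\sigma^2}}\,dm = \Phi\!\left(\frac{b-\mu}{\sigma}\right) - \Phi\!\left(\frac{a-\mu}{\sigma}\right)$$

and

$$\Phi(x) = \frac{1}{\sqrt{2\pi}}\int_{-\infty}^{x} e^{-\frac{m^2}{2}}\,dm = P\{N(0,1) \le x\} \tag{7}$$

By introducing the reduced normal density $Z(x) = \frac{1}{\sqrt{2\pi}}\,e^{-\frac{x^2}{2}} = Z(-x)$,
we can write:

$$f(m) = \frac{1}{\sigma}\left[\Phi(\tilde b) - \Phi(\tilde a)\right]^{-1} Z(\tilde m) \tag{8}$$

where $\tilde\xi = \frac{\xi - \mu}{\sigma}$, $\xi = a, b, m$.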


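The unknown $\mu$ and $\sigma$ are determined by considering the 2nd and 3rd
constraints in (4), so we have:

$$\mu - \sigma\left[\Phi(\tilde b) - \Phi(\tilde a)\right]^{-1}\left[Z(\tilde b) - Z(\tilde a)\right] = E\{M\} \tag{9}$$

$$\left\{1 - \frac{\tilde b\,Z(\tilde b) - \tilde a\,Z(\tilde a)}{\Phi(\tilde b) - \Phi(\tilde a)} - \left[\frac{Z(\tilde b) - Z(\tilde a)}{\Phi(\tilde b) - \Phi(\tilde a)}\right]^2\right\}\sigma^2 = \mathrm{Var}\{M\} \tag{10}$$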




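Example: A calibration certificate states that the mass of a stainless steel
standard $m_S$ of nominal value one kilogram is 1 000,000 325 g and that "the
uncertainty of this value is 240 μg at the three standard deviation level
($k = 3$)". The standard uncertainty of the mass standard is then simply
$u(m_S) = (240\ \mu\mathrm{g})/k = 80\ \mu\mathrm{g}$. This corresponds to a relative standard
uncertainty $u(m_S)/m_S = 80 \times 10^{-9}$; the estimated variance is
$u^2(m_S) = 6{,}4 \times 10^{-9}\ \mathrm{g}^2$.
So that $E\{M\} = 1\,000{,}000\,325$ g, $\mathrm{Var}\{M\} = 6{,}4 \times 10^{-9}\ \mathrm{g}^2$ and

$$a = E\{M\} - k\sqrt{\mathrm{Var}\{M\}} = 1\,000{,}000\,085\ \mathrm{g}$$

$$b = E\{M\} + k\sqrt{\mathrm{Var}\{M\}} = 1\,000{,}000\,565\ \mathrm{g}$$

$$\mu = E\{M\} = 1\,000{,}000\,325\ \mathrm{g}$$

$$\sigma = \sqrt{\mathrm{Var}\{M\}}\left[1 - \frac{2k\,Z(k)}{2\Phi(k) - 1}\right]^{-1/2} = 81{,}08832\ \mu\mathrm{g}$$

From (9) we have

$$f(m) = 0.01237\,\frac{1}{\sqrt{2\pi}}\,e^{-\frac{(m-\mu)^2}{2\sigma^2}} \qquad \text{with } a < m < b$$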
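The maximum entropy is:

$$\mathrm{MaxEnt} = -\int_a^b f(m)\,\ln f(m)\,dm = \ln\!\left(\sigma W\sqrt{2\pi}\right) + \frac{\mathrm{Var}\{M\}}{2\sigma^2}$$

with $W = \Phi\!\left(\frac{b-\mu}{\sigma}\right) - \Phi\!\left(\frac{a-\mu}{\sigma}\right)$.

By considering the normal distribution in the interval $(-\infty, +\infty)$ we deduce
$E\{M\} = \mu$, $\mathrm{Var}\{M\} = \sigma^2$ and $\mathrm{MaxEnt} = \ln\!\left(\sigma\sqrt{2\pi e}\right)$.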
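
The numbers of the example can be reproduced with a short sketch. It assumes the symmetric case (interval centered on the mean, so that the variance of the truncated normal reduces to $\sigma^2[1 - 2\tilde b Z(\tilde b)/(2\Phi(\tilde b) - 1)]$) and works in micrograms relative to the mean; the function names are illustrative. The first function evaluates the truncation factor at $\tilde b = k$, which gives the value of $\sigma$ quoted in the example; the bisection routine is an additional, hypothetical refinement that solves the variance constraint for the fixed interval half-width of 240 μg.

```python
import math


def Phi(x):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def Z(x):
    """Reduced (standard) normal density."""
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)


def truncated_variance(sigma, half_width):
    """Variance of N(mu, sigma^2) truncated to [mu - half_width, mu + half_width]."""
    b = half_width / sigma
    return sigma**2 * (1.0 - 2.0 * b * Z(b) / (2.0 * Phi(b) - 1.0))


def sigma_first_step(var, k):
    """Sigma obtained by evaluating the truncation factor at b~ = k."""
    return math.sqrt(var / (1.0 - 2.0 * k * Z(k) / (2.0 * Phi(k) - 1.0)))


def sigma_bisection(var, half_width, lo, hi, iterations=200):
    """Solve truncated_variance(sigma) = var by bisection (the variance grows with sigma)."""
    for _ in range(iterations):
        mid = 0.5 * (lo + hi)
        if truncated_variance(mid, half_width) < var:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


var_M = 80.0**2                                    # Var{M} in ug^2
s1 = sigma_first_step(var_M, 3.0)                  # value quoted in the example
s_refined = sigma_bisection(var_M, 240.0, 80.0, 100.0)
prefactor = 1.0 / (s1 * (2.0 * Phi(3.0) - 1.0))    # prefactor of f(m), in 1/ug
print(s1, s_refined, prefactor)
```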


2.2. Application of the MINCENT to a measurement model 


For this application, we consider a general measurement process that treats 
multiple measurands and yields simultaneously multiple results or estimates. 

We assume that all the relevant quantities involved in the process are 
interpreted or regarded as random variables. 

In other words, all the sources of uncertainty are characterized by probability
distribution functions, the form of which is either known from
measurements (the underlying probability distributions are sampled by means of
repeated measurements and so they can be estimated through distribution-free
inference) or unknown and therefore conjectured.

We classify all the involved quantities into two principal sets represented by
row vectors:
1) output quantities $Y$, $m$ in number;
2) input quantities $X$, which comprise the rest of the quantities, $n$ in number.

Let $(x, y)$ be the actual realizations of $(X, Y)$ on a particular occasion; they
represent a state of the measurement process on that occasion. The process has a
set $D$ of possible states, $(x, y) \in D$, that identify the joint domain of the random
variables $(X, Y)$.


Further, it is often possible in a measurement process to identify
mathematical and/or empirical models that link input and output quantities
through functional relationships of the type:

$$Y_i = g_i(X_1, \ldots, X_n); \qquad i = 1, \ldots, m \tag{11}$$

with m ### n. In practical situations the transformation defined by (11) is
differentiable and invertible.


The mutual behavior of the input quantities $X$ and the output quantities $Y$ is
statistically described by the joint probability density $f(x, y)$, which can be written as:

$$f(x, y) = f(x)\,f(y \mid X = x) \tag{12}$$

where $f(x)$ is the marginal joint density of the input quantities $X$ and
$f(y \mid X = x)$ is the conditional joint density of the output quantities $Y$,
given $X = x$.
In the Bayesian approach we have another important linkage between $f(x)$
and $f(x \mid Y = y)$, that is:

$$f(x \mid Y = y) = c\,f(x)\,f(y \mid X = x) \tag{13}$$

where $f(x \mid Y = y)$ is the posterior joint density of the input quantities $X$, given
$Y = y$, and $f(x)$ is interpreted as the prior joint density, which tells us what is
known about $X$ without knowledge of $Y$, while $f(y \mid X = x) = \ell(x, y)$,
regarded as a function of $x$ for prefixed output values $y$, represents the
well-known likelihood; $c$ is a "normalizing" constant necessary to ensure that the
posterior joint density integrates, with respect to $x$, to one.


We apply Kullback's principle to the joint density function $f(x, y)$
expressed by (12) and take into account the Bayesian relation (13), which
discriminates between the prior densities of the input quantities $X$ and the
conditional densities of the output quantities given the input ones.
To this end we introduce the joint cross-entropy in the following manner:

$$S = \int_D f(x, y)\,\ln\frac{f(x, y)}{f_0(x, y)}\,dx\,dy \tag{14}$$

where $dx = dx_1 \cdots dx_n$, $dy = dy_1 \cdots dy_m$ and $f_0(x, y)$ is defined as an "invariant
measure" function and represents known "prior" knowledge of the measurement
process.


Practical solution for MINCENT 


By substituting eq. (12) into (14) and considering the domain $D$ divisible
into two sub-domains $D_0$ and $D_1$, corresponding respectively to the ranges of
$X$ and of $Y$, we can write:

$$S = S_0 + E\{S_1(X)\} \tag{15}$$

where $S_0 = \int_{D_0} f(x)\,\ln\frac{f(x)}{f_0(x)}\,dx$, having imposed the normalizing
condition

$$\int_{D_1} f(y \mid X = x)\,dy = 1 \tag{16}$$

and

$$S_1(X) = \int_{D_1} f(y \mid X)\,\ln\frac{f(y \mid X)}{f_0(y \mid X)}\,dy \tag{17}$$

so that:

$$E\{S_1(X)\} = \int_{D_0} f(x)\left[\int_{D_1} f(y \mid x)\,\ln\frac{f(y \mid x)}{f_0(y \mid x)}\,dy\right]dx \tag{18}$$

It is important to emphasize that, if $f_0(x)$ and $f_0(y \mid x)$ are constants, $S_0$
and $S_1(X)$ coincide, up to an additive constant, with $-H_0$ and $-H_1(X)$
respectively, where $H_0$ is the
classical Jaynes entropy referred to the density of the input quantities and
$H_1(X)$ is the one referred to the conditional density of the output quantities given
the input quantities $X$.

If $S_1(X) = S_1$ turned out to be independent of $X$, by imposing the other
normalizing condition:

$$\int_{D_0} f(x)\,dx = 1 \tag{19}$$

we would deduce the important additive property:

$$S = S_0 + S_1$$

Thus, in this case, the joint cross-entropy is equal to the cross-entropy
referred to the input quantities plus the cross-entropy of conditional density of 
output quantities given the input ones. 
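
The decomposition (15) can be checked on a small discrete example. The sketch below is an illustration only: the two joint tables are hypothetical, and sums replace the integrals over the sub-domains. It computes the joint cross-entropy directly and compares it with the cross-entropy of the input marginals plus the expected conditional cross-entropy.

```python
import math

# Hypothetical joint densities on a 2 x 3 grid of states (x, y); rows index x.
f = [[0.10, 0.25, 0.15],
     [0.20, 0.05, 0.25]]
f0 = [[0.20, 0.10, 0.10],
      [0.15, 0.25, 0.20]]


def marginal(joint):
    """Marginal density of x obtained by summing each row over y."""
    return [sum(row) for row in joint]


def cross_entropy(g, p):
    """Discrete cross-entropy sum g_i ln(g_i / p_i)."""
    return sum(gi * math.log(gi / pi) for gi, pi in zip(g, p) if gi > 0.0)


fx, f0x = marginal(f), marginal(f0)

# Joint cross-entropy S, cf. (14).
S = cross_entropy([v for row in f for v in row], [v for row in f0 for v in row])
# S_0: cross-entropy of the input marginals.
S0 = cross_entropy(fx, f0x)
# E{S_1(X)}: conditional cross-entropies averaged over f(x), cf. (17) and (18).
ES1 = sum(
    fx[i] * cross_entropy([v / fx[i] for v in f[i]], [v / f0x[i] for v in f0[i]])
    for i in range(len(f))
)
print(abs(S - (S0 + ES1)) < 1e-12)
```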

If the constraints on the input quantities do not interfere with those on
the output quantities, then in order to minimize the general joint cross-entropy we
can minimize first with respect to $f(y \mid X = x)$ and then with respect to $f(x)$.

In a first step we minimize the expression $S_1(X)$, given by (17), subject
both to the normalizing condition (16) and to additional integral
constraints of the type:

$$E\{g_i(Y, X) \mid X\} = \int_{D_1} g_i(y, X)\,f(y \mid X = x)\,dy = \bar g_i(X), \qquad i = 1, 2, \ldots \tag{20}$$

where the functions $g_i(y, X)$ and the values $\bar g_i(X)$ are given, and in general
they can depend on $X$.

The minimum of the cross-entropy $S_1(X)$ subject to constraints (16) and (20)
is found through the well-known Lagrange variational method.
We must solve Euler's equation for the functional:

$$f(y \mid X = x)\left[\ln\frac{f(y \mid X = x)}{f_0(y \mid X = x)} - \lambda_0 - \sum_i \lambda_i\,g_i(y, X)\right] \tag{21}$$

where $\lambda_k$ ($k = 0, 1, \ldots$) are the well-known Lagrange multipliers.


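After simple manipulations we find the following solution:

$$f(y \mid X = x) = f_0(y \mid X = x)\,e^{\lambda_0 - 1 + \sum_i \lambda_i\,g_i(y, X)} \tag{22}$$

By substituting (22) into (17) we deduce the minimum of the cross-entropy
$S_1(X)$, that is:

$$S_{1\min}(X) = \int_{D_1} f(y \mid X = x)\left[\lambda_0 - 1 + \sum_i \lambda_i\,g_i(y, X)\right]dy \tag{23}$$

By recalling constraints (16) and (20), with $\bar g_i(X)$ the prescribed values in (20),
we finally deduce:

$$S_{1\min}(X) = \lambda_0 - 1 + \sum_i \lambda_i\,\bar g_i(X) \tag{24}$$

Now, from (18), by taking into account (19), we obtain:

$$E\{S_{1\min}(X)\} = \lambda_0 - 1 + \sum_i \lambda_i\,E\{\bar g_i(X)\} \tag{25}$$

Let us introduce (25) into (15); we derive:

$$S = \int_{D_0} f(x)\left[\ln\frac{f(x)}{f_0(x)} + \lambda_0 - 1 + \sum_i \lambda_i\,\bar g_i(x)\right]dx \tag{26}$$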




Now we consider, besides the normalization condition (19), additional
constraints of the type:

$$E\{w_i(X)\} = \int_{D_0} w_i(x)\,f(x)\,dx = \bar w_i, \qquad i = 1, \ldots, \nu \tag{27}$$

where the functions $w_i(x)$ and the values $\bar w_i$ are known. The minimum of $S$
with respect to $f(x)$ subject to constraints (19) and (27) is given, as
previously, by introducing new Lagrange multipliers $\mu_i$ ($i = 0, 1, \ldots, \nu$).
We must solve a new Euler's equation for the following functional:

$$f(x)\left[\ln\frac{f(x)}{f_0(x)} + S_{1\min}(x) - \mu_0 - \sum_{i=1}^{\nu} \mu_i\,w_i(x)\right] \tag{28}$$

After simple manipulation, taking into account eq. (24), we obtain:

$$f(x) = f_0(x)\,e^{-S_{1\min}(x) + \mu_0 - 1 + \sum_{i=1}^{\nu} \mu_i\,w_i(x)} \tag{29}$$


Example: The repeated measurements 


We repeat the measurement process $n$ times on the same measurand,
according to the "repeatability conditions" established by the "Guide to the
expression of uncertainty in measurement" (ISO 1993).

We may conveniently regard any set of measurement results $y = (y_1, \ldots, y_n)$
as the $n$-dimensional realization of an induced random vector $Y = (Y_1, \ldots, Y_n)$,
which we can call the output vector.

Let us now assign the random variable $X$ to the measurand and express its
occasional realization as $x$, which we suppose constant during all the
replications. We suppose that the conditional statistical parameters are given by:

$$E\{Y_i \mid X = x\} = x, \qquad E\{(Y_i - x)^2 \mid X = x\} = \sigma^2, \qquad i = 1, \ldots, n \tag{30}$$

Let $f(x, y) = f(x)\,f(y \mid x)$ be the unknown joint probability density of the
measurand $X$ and the random vector $Y$, and $f_0(x, y) = f_0(x)\,f_0(y \mid x)$ the prior
joint density, which summarizes all the previous knowledge and experience.
Assuming the conditional independence of the output quantities $Y_1, \ldots, Y_n$, we
can write:

$$f(y \mid x) = \prod_{i=1}^{n} f(y_i \mid x), \qquad f_0(y \mid x) = \prod_{i=1}^{n} f_0(y_i \mid x) \tag{31}$$

where $f(y_i \mid x)$ and $f_0(y_i \mid x)$ are the common marginal conditional densities at the
generic argument $y_i$, assuming $X = x$.
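Taking into account eq. (31) we deduce the joint cross-entropy in the form
(with $y$ denoting the generic scalar argument):

$$S_1(x) = n\int f(y \mid x)\,\ln\frac{f(y \mid x)}{f_0(y \mid x)}\,dy \tag{32}$$

Using the Lagrange multipliers method, the MINCENT density is given by

$$f(y \mid x) = f_0(y \mid x)\,e^{\lambda_0 - 1 + \lambda_1 y + \lambda_2 (y - x)^2} \tag{33}$$

where the Lagrange multipliers $\lambda_0, \lambda_1, \lambda_2$ are determined by using the given
constraints. Assuming, for simplicity, a constant prior density $f_0(y \mid x) = k$, after
simple manipulation, taking into account eq. (31), we obtain for the output vector:

$$f(y \mid x) = \left(2\pi\sigma^2\right)^{-n/2} e^{-q/2}, \qquad \text{where } q = \frac{1}{\sigma^2}\sum_{i=1}^{n}(y_i - x)^2 \tag{34}$$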
Now we consider the Bayesian inference and introduce the posterior density
of the measurand, that is:

$$f(x \mid y_0) = c\,f(x)\,f(y_0 \mid x) \tag{35}$$

where $y_0 = (y_{01}, \ldots, y_{0n})$ is the set of effective results on the particular occasion
when $X = x$ and $c$ is a convenient normalizing constant.
By introducing (34) into (35) and assuming $f(x)$ constant, we obtain:

$$f(x \mid y_0) = c\,e^{-\frac{1}{2\sigma^2}\sum_{i=1}^{n}(y_{0i} - x)^2} = c'\,e^{-\frac{n}{2\sigma^2}(x - \bar y_0)^2} \tag{36}$$

where $c'$ is the updated normalizing constant.
The posterior expectation and variance of the measurand are given by:

$$E\{X \mid y_0\} = \frac{1}{n}\sum_{i=1}^{n} y_{0i} = \bar y_0, \qquad \mathrm{Var}\{X \mid y_0\} = \frac{\sigma^2}{n} \tag{37}$$
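
The posterior moments in (37) can be verified numerically. The sketch below is an illustration under stated assumptions: a flat prior $f(x)$, a known $\sigma$, and a hypothetical set of readings `y0`; it normalizes the unnormalized posterior on a grid instead of using the closed form.

```python
import math


def posterior_moments(y0, sigma, grid_points=2001, span=10.0):
    """Normalize the posterior (flat prior, Gaussian likelihood) on a grid
    and return its mean and variance."""
    n = len(y0)
    ybar = sum(y0) / n
    half = span * sigma / math.sqrt(n)           # grid half-width around the sample mean
    step = 2.0 * half / (grid_points - 1)
    xs = [ybar - half + i * step for i in range(grid_points)]
    # Unnormalized posterior; the constant sum (y - ybar)^2 is subtracted for stability.
    w = [
        math.exp(-sum((y - x) ** 2 - (y - ybar) ** 2 for y in y0) / (2.0 * sigma**2))
        for x in xs
    ]
    total = sum(w)
    mean = sum(x * wi for x, wi in zip(xs, w)) / total
    var = sum((x - mean) ** 2 * wi for x, wi in zip(xs, w)) / total
    return mean, var


y0 = [10.02, 9.98, 10.05, 9.97, 10.01]           # hypothetical readings
sigma = 0.03                                     # assumed known standard deviation
mean, var = posterior_moments(y0, sigma)
print(mean, var)                                 # compare with ybar and sigma^2 / n
```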
A model for atomic clock errors is given by a stochastic differential equation.
The probability that the clock error does not exceed a limit of permissible error 
is also studied by means of the survival probability. 


1. Introduction 


The behavior of the atomic clocks is typically studied considering their 
phase and frequency error with respect to an ideal clock. This quantity is known 
to be well modelled by a stochastic process. We present a formal model of the
clock phase error based on the theory of stochastic differential equations. In 
addition, we study the problem of the first passage time of the stochastic 
processes across two fixed constant boundaries. This topic is also known as 
survival probability between two absorbing barriers and it can be studied by 
means of the infinitesimal generator and by the use of simulation or numerical 
techniques. The two approaches are discussed and compared in order to show 
the importance of accurate numerical schemes to solve the equations. 

We present the application of these results to the evaluation of the 
probability that the clock error exceeds the limit of permissible error at a certain 
time after synchronization. This aspect has interesting applications in satellite 
systems and also in the frame of the Mutual Recognition Arrangement where it 
may be important not only demonstrating that two standards are “equivalent”, 
but also evaluating how long such equivalence can last. Results on experimental 
clock measures are presented.


* Work partially funded under EU SofTools_MetroNet Contract N° G6RT-CT-2001-05061.


2. Mathematical model for the atomic clock error


Typically the clock signal is affected by five random noises that are known
in the metrological literature as: White phase modulation, Flicker phase modulation,
White frequency modulation (WFM), which produces a Brownian motion (BM,
also called Wiener process) on the phase, Flicker frequency modulation, and Random
walk frequency modulation (RWFM), which produces an integrated Brownian
motion (IBM) on the phase. The presence of the different kinds of random
noise and their magnitude differ from clock to clock. From experimental
evidence it appears that on Cesium clocks the predominant noises are the WFM
and the RWFM, which correspond, in mathematical language, to the Wiener
process and to the integrated Wiener process respectively on the phase error. In
this paper we focus on the atomic clock model based on these two main noises.
Denoting with $X_1(t)$ the time-dependent evolution of the atomic clock phase
error, it can be viewed as the solution of a dynamical system, written as a
system of stochastic differential equations whose driving terms are two
independent standard Wiener processes $\{W_1(t), t \ge 0\}$ and $\{W_2(t), t \ge 0\}$:

$$\begin{cases} dX_1(t) = (X_2(t) + \mu_1)\,dt + \sigma_1\,dW_1(t) \\ dX_2(t) = \mu_2\,dt + \sigma_2\,dW_2(t) \end{cases} \qquad t \ge 0 \tag{1}$$


with initial conditions $X_1(0) = x$, $X_2(0) = y$, where $\mu_1$ and $\mu_2$ are constants that in
the clock case represent what is generally referred to as the deterministic
phenomena of the atomic clocks. In particular $\mu_1$ is related to the constant initial
frequency offset, while $\mu_2$ represents what is generally indicated by frequency
ageing or drift. The positive constants $\sigma_1$ and $\sigma_2$ represent the diffusion
coefficients of the two noise components and give the intensity of each noise.
The relationship between the diffusion coefficients and the more familiar Allan
variance used in time metrology is known [1]. $W_1(t)$ is the Wiener noise acting
on the phase $X_1$, driven by a white noise on the frequency. Since the Wiener
process can be thought of as the integral of a white noise, the phase deviation
$X_1$ is the integral of the frequency deviation affected by white noise. $W_2(t)$
represents the Wiener process on the frequency (the so-called "Random Walk
FM") and gives an integrated Wiener process on the phase.

Note that the initial conditions $x$ and $y$ should not be confused with the
metrological notation, where usually $x$ represents the phase and $y$ represents the
frequency. Here we consider $x$ and $y$ as numbers, but they can be considered as
random variables (cf. Section 3). Note that $X_1$ represents the phase deviation,
while the frequency deviation is $\dot X_1$. The second component $X_2$ is
only a part of the clock frequency deviation, i.e. what is generally called the
"random walk" component.

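Since equation (1) is a bidimensional strictly linear stochastic differential
equation, it is possible to obtain its solution in closed form [2,3]:

$$\begin{cases} X_1(t) = x + (y + \mu_1)\,t + \mu_2\dfrac{t^2}{2} + \sigma_1 W_1(t) + \sigma_2\displaystyle\int_0^t W_2(s)\,ds \\ X_2(t) = y + \mu_2\,t + \sigma_2 W_2(t) \end{cases} \tag{2}$$
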
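This solution can be written in iterative form, which is very useful for
further processing of the data and for simulations [3]. Let us now consider a
fixed time interval $[0, T] \subset \mathbb{R}^+$ and an equally spaced partition
$0 = t_0 < t_1 < \ldots < t_N = T$, and let us denote the resulting discretization step with
$\tau = t_{k+1} - t_k$ for each $k = 0, 1, \ldots, N-1$. We can express the solution (2) at
time $t_{k+1}$ in terms of the position of the process at time $t_k$:

$$\begin{cases} X_1(t_{k+1}) = X_1(t_k) + (X_2(t_k) + \mu_1)\,\tau + \mu_2\dfrac{\tau^2}{2} + J_{k,1} \\ X_2(t_{k+1}) = X_2(t_k) + \mu_2\,\tau + J_{k,2} \end{cases} \tag{3}$$

where $J_{k,1}$ and $J_{k,2}$ are the two components of the vector $J_k$:

$$J_k = \begin{pmatrix} \sigma_1\left(W_1(t_{k+1}) - W_1(t_k)\right) + \sigma_2\displaystyle\int_{t_k}^{t_{k+1}}\left(W_2(s) - W_2(t_k)\right)ds \\ \sigma_2\left(W_2(t_{k+1}) - W_2(t_k)\right) \end{pmatrix} \sim N\!\left(0,\ \begin{pmatrix} \sigma_1^2\,\tau + \sigma_2^2\,\dfrac{\tau^3}{3} & \sigma_2^2\,\dfrac{\tau^2}{2} \\ \sigma_2^2\,\dfrac{\tau^2}{2} & \sigma_2^2\,\tau \end{pmatrix}\right)$$

The vector $J_k$ is the "innovation", i.e. the stochastic part that is added to
build the process at the instant $t_{k+1}$. It depends only on the increments of the
Wiener processes in the interval $(t_k, t_{k+1})$. Such an increment, $W(t_{k+1}) - W(t_k)$, is
distributed as a Wiener process originated at the instant $t_k$ on an interval of
length $\tau$, i.e. $N(0, \tau)$.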
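
The iterative form lends itself to direct simulation. The following sketch is one possible implementation, not the authors' code: the function names, the seed handling and the way the correlated pair $(J_{k,1}, J_{k,2})$ is drawn (conditioning the first component on the second, which is equivalent to a Cholesky factorization of the covariance matrix) are assumptions made for illustration.

```python
import math
import random


def innovation(sigma1, sigma2, tau, rng):
    """Draw J_k = (J_k1, J_k2) from the zero-mean bivariate normal law of the innovation."""
    c11 = sigma1**2 * tau + sigma2**2 * tau**3 / 3.0
    c12 = sigma2**2 * tau**2 / 2.0
    c22 = sigma2**2 * tau
    z1, z2 = rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)
    j2 = math.sqrt(c22) * z2
    # Conditional construction: J_k1 = slope * J_k2 + independent residual.
    slope = c12 / c22 if c22 > 0.0 else 0.0
    j1 = slope * j2 + math.sqrt(max(c11 - slope * c12, 0.0)) * z1
    return j1, j2


def simulate_clock(x, y, mu1, mu2, sigma1, sigma2, tau, n_steps, seed=0):
    """Simulate (X1, X2) on a regular grid of step tau with the iterative form."""
    rng = random.Random(seed)
    x1, x2 = [x], [y]
    for _ in range(n_steps):
        j1, j2 = innovation(sigma1, sigma2, tau, rng)
        # X1 is updated first because it uses X2 at the previous instant.
        x1.append(x1[-1] + (x2[-1] + mu1) * tau + mu2 * tau**2 / 2.0 + j1)
        x2.append(x2[-1] + mu2 * tau + j2)
    return x1, x2


# Noise-free check: the recursion must reproduce the deterministic part of (2).
x1, x2 = simulate_clock(1.0, 0.5, 0.2, 0.1, 0.0, 0.0, 0.01, 100)
print(x1[-1], x2[-1])
```

With both diffusion coefficients set to zero the recursion returns the deterministic part of the closed-form solution at $T = 1$, which gives a simple sanity check before switching the noise on.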


3. The survival probability and infinitesimal generator 


Let $X(t)$ be a homogeneous stochastic Markov process starting in $x_0$ at time
$t = 0$, $\{X(t), t \ge 0;\ X(0) = x_0\}$, and let $T_{(-n,m)}$ be the first passage time of $X(t)$
through two constant absorbing boundaries $-n$ and $m$:

$$T_{(-n,m)} = \inf\{\,t \ge 0 : X(t) \notin (-n, m);\ X(\theta) \in (-n, m)\ \forall\,\theta < t;\ X(0) = x_0\,\}$$

Let $f(t, x, B) = P(X_{t+s} \in B \mid X_s = x)$ be the transition probability density
function of $X$, i.e. the probability that the homogeneous Markov process that
starts in $x$ at time $s$ reaches the set $B$ at time $t+s$, and let $p(t, x)$ be the probability
that the process starting in $x$ at time $t = 0$ does not reach the boundaries in the
time interval $(0, t)$:

$$p(t, x) = P\left(T_{(-n,m)} \ge t \mid X_0 = x\right)$$

This function is called the survival probability. The survival
probability is very important in our applications because we can interpret it as
the probability that the clock error has not exceeded the limit of permissible
error at a certain time after synchronization.
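The infinitesimal generator is defined [4,5] as the operator $A$ such that

$$A g(x) = \lim_{t \to 0}\frac{1}{t}\left[T_t\,g(x) - g(x)\right]$$

where $g$ is a bounded function and $T_t$ is an operator defined as

$$T_t\,g(x) = \int g(y)\,f(t, x, dy) = E^x\left[g(X_t)\right]$$

$A g(x)$ can be interpreted as the mean infinitesimal rate of change of $g(X_t)$ in case
$X_t = x$. It is possible to prove [4] that there is a strong link between
stochastic differential equations and partial differential equations via the
infinitesimal generator. Indeed, if we consider the general $m$-dimensional
stochastic differential equation:

$$dX_t = \mu(X_t)\,dt + \sigma(X_t)\,dW_t$$

where $\mu: \mathbb{R}^m \to \mathbb{R}^m$, $\sigma: \mathbb{R}^m \to \mathbb{R}^{m \times m}$ are measurable functions, $x \in \mathbb{R}^m$, the
following expression provides the infinitesimal generator:

$$L = \frac{1}{2}\sum_{i,j}\left(\sigma\sigma'\right)_{ij}\frac{\partial^2}{\partial x_i\,\partial x_j} + \sum_i \mu_i\frac{\partial}{\partial x_i} \tag{4}$$

where $\sigma'$ is the transposed matrix.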

The infinitesimal generator is very important in the study of stochastic
processes since it allows stochastic quantities to be determined by solving partial
differential equations.

For example, the infinitesimal generator may be used to study the survival
probability by means of the following partial differential equation:

$$\begin{cases} L\,p(t, x) = \dfrac{\partial p(t, x)}{\partial t} & \text{in } [0, T[\,\times D \\ p(0, x) = 1_D(x) \\ p(t, x) = 0 & \text{on } [0, T[\,\times \partial D \end{cases} \tag{5}$$

where the function $p$, solution of (5), is the survival probability defined on the
time domain $[0, T]$ and on the space domain $D$.

The same partial differential equation applied to the transition probability
density function leads to the Kolmogorov backward equation.


3.1. Examples


In this section we study the clock model introduced in Section 2 using
the PDE (5). In particular, if we consider the bidimensional model (1) with $\mu_1 = 0$,
$\mu_2 = 0$ without losing generality, we get that the infinitesimal generator
related to this stochastic process is:

$$L = y\frac{\partial}{\partial x} + \frac{\sigma_1^2}{2}\frac{\partial^2}{\partial x^2} + \frac{\sigma_2^2}{2}\frac{\partial^2}{\partial y^2}$$


As a first approach, we focus our attention on the separate studies of the
simple BM and of the IBM. The BM can be defined via the following
one-dimensional stochastic differential equation:

$$dX_t = \mu\,dt + \sigma\,dW_t \tag{6}$$

where now $\mu \in \mathbb{R}$, $\sigma \in \mathbb{R}^+$ are constants. The infinitesimal generator of (6),
obtained from (4), is:

$$L = \mu\frac{\partial}{\partial x} + \frac{\sigma^2}{2}\frac{\partial^2}{\partial x^2} \tag{7}$$

Note that this is the generator of the classical heat equation. In this case it is
possible to derive the closed-form solution of the PDE (5), i.e. the survival
probability of the BM, that is [6]:


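$$p(t, 0) = \int_{-n}^{m}\frac{1}{\sigma\sqrt{2\pi t}}\sum_{k=-\infty}^{+\infty}\left\{\exp\!\left[\frac{\mu x'_k}{\sigma^2} - \frac{(x - x'_k - \mu t)^2}{2\sigma^2 t}\right] - \exp\!\left[\frac{\mu x''_k}{\sigma^2} - \frac{(x - x''_k - \mu t)^2}{2\sigma^2 t}\right]\right\}dx$$

where $x'_k = 2k(m+n)$, $x''_k = 2m - x'_k$, with $k = 0, \pm 1, \pm 2, \ldots$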


On the other hand, the IBM is defined by the SDE (1) considering $\sigma_1 = 0$,
$\mu_1 = 0$. Without losing generality, as a first study case, we set $\mu_2 = 0$ and the
infinitesimal generator is:

$$L = y\frac{\partial}{\partial x} + \frac{\sigma_2^2}{2}\frac{\partial^2}{\partial y^2} \tag{8}$$

For the IBM the only known analytical result concerns the mean value of
the first passage time across two fixed constant boundaries [7,8]. The analytical
expression of the survival probability is not known, so we need numerical
methods or simulations.
In the following we present two numerical methods that allow us to obtain an
estimate of the survival probability [9,10].


4. Finite difference and Monte Carlo methods


One numerical method, which makes use of the SDE, is the Monte Carlo
method. It consists in simulating a given number of realizations of the
considered stochastic process and identifying the first passage time
$T_{(-n,m)}$ of each simulated trajectory across the two constant boundaries $-n$ and $m$.
From the statistics of $T_{(-n,m)}$ it is possible to estimate several quantities, such as
the survival probability.

The second numerical method makes use of the PDE (5) with the aid of the
finite difference method. This method is based on the discretization of the
domain and on the substitution of the continuous derivatives with discrete
approximations:

$$\frac{\partial p}{\partial x} = \frac{p_{i+1,j} - p_{i,j}}{h} + O(h)$$

$$\frac{\partial^2 p}{\partial x^2} = \frac{p_{i+1,j} - 2p_{i,j} + p_{i-1,j}}{h^2} + O(h^2)$$

Hence, in this case the survival probability is estimated on a discrete grid.


4.1. Application to the Brownian motion 


In this subsection we consider the application of the two numerical methods 
to the simple case of a BM (6) with $\mu = 0$.

In order to apply the Monte Carlo method, we simulate N =10° trajectories, 
using the corresponding iterative form, obtained as solution of (6) [2], 

$$X(t_{k+1}) = X(t_k) + \sigma\left(W(t_{k+1}) - W(t_k)\right)$$

with discretization step $\tau = 0.01$ and diffusion parameter $\sigma = 1$, and we detect
the first passage time of each trajectory across two fixed constant boundaries $-n$
and $m$ with $m = n = 1$.

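On the other hand, the PDE (5) with the BM generator (7) is written as

$$\begin{cases} \dfrac{1}{2}\sigma^2\dfrac{\partial^2 p(t, x)}{\partial x^2} = \dfrac{\partial p(t, x)}{\partial t} & t \in [0, T],\ x \in D = [-n, m] \\ p(t, -n) = 0 \\ p(t, m) = 0 \\ p(0, x) = 1_D(x) \end{cases}$$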


and the application of the finite difference method leads to the following
discretization scheme in terms of the survival probability $p(t, x)$:

$$\begin{cases} \dfrac{1}{2}\sigma^2\dfrac{p_{i-1,j+1} - 2p_{i,j+1} + p_{i+1,j+1}}{h_x^2} = \dfrac{p_{i,j+1} - p_{i,j}}{h_t} \\ p(t, 1) = 0 \\ p(t, -1) = 0 \\ p(0, x) = 1_D(x) \end{cases}$$

This scheme is implemented with discretization steps $h_x = 0.1$, $h_t = 0.01$.

In figure 1 the survival probabilities obtained with the analytical solution, the
Monte Carlo method and the finite differences method are compared. The
three estimated curves are superimposed, showing that the analytical and
numerical solutions agree.


Figure 1. Survival probabilities p versus time t obtained with the analytical solution, the Monte Carlo
method and the finite differences method in case of BM (all lines are superimposed).
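
The comparison can be sketched in a few lines for the driftless BM with $\sigma = 1$ and $m = n = 1$, started at the origin. This is an illustration under stated assumptions rather than the authors' implementation: the finite-difference routine uses an implicit time step solved with the Thomas algorithm, the number of Monte Carlo paths is kept small, and the reference value comes from the eigenfunction series of the heat equation on a symmetric interval, which is used here in place of the image series quoted in the text.

```python
import math
import random


def survival_series(t, sigma=1.0, a=1.0, terms=50):
    """Eigenfunction series for the survival probability in (-a, a), start at 0."""
    total = 0.0
    for j in range(terms):
        odd = 2 * j + 1
        total += (-1) ** j / odd * math.exp(-(odd * math.pi * sigma) ** 2 * t / (8.0 * a * a))
    return 4.0 / math.pi * total


def survival_fd(t_end, sigma=1.0, n=1.0, m=1.0, hx=0.1, ht=0.01):
    """Implicit finite differences for dp/dt = (sigma^2 / 2) d2p/dx2, value at x = 0."""
    nx = int(round((m + n) / hx))                 # number of space intervals
    r = 0.5 * sigma**2 * ht / hx**2
    p = [0.0] + [1.0] * (nx - 1) + [0.0]          # p(0, x) = 1 inside D, 0 on the boundary
    size = nx - 1
    for _ in range(int(round(t_end / ht))):
        # Thomas algorithm for -r p[i-1] + (1 + 2r) p[i] - r p[i+1] = p_old[i].
        cp, dp = [0.0] * size, [0.0] * size
        cp[0] = -r / (1.0 + 2.0 * r)
        dp[0] = p[1] / (1.0 + 2.0 * r)
        for i in range(1, size):
            denom = (1.0 + 2.0 * r) + r * cp[i - 1]
            cp[i] = -r / denom
            dp[i] = (p[i + 1] + r * dp[i - 1]) / denom
        new = [0.0] * size
        new[-1] = dp[-1]
        for i in range(size - 2, -1, -1):
            new[i] = dp[i] - cp[i] * new[i + 1]
        p = [0.0] + new + [0.0]
    return p[int(round(n / hx))]


def survival_mc(t_end, sigma=1.0, n=1.0, m=1.0, tau=0.01, n_paths=2000, seed=0):
    """Fraction of simulated paths (start at 0) still inside (-n, m) at t_end."""
    rng = random.Random(seed)
    step_sd = sigma * math.sqrt(tau)
    alive = 0
    for _ in range(n_paths):
        x = 0.0
        for _ in range(int(round(t_end / tau))):
            x += step_sd * rng.gauss(0.0, 1.0)
            if x <= -n or x >= m:
                break
        else:
            alive += 1
    return alive / n_paths


print(survival_series(1.0), survival_fd(1.0), survival_mc(1.0))
```

Because the simulated paths are checked against the boundaries only at the grid instants, excursions between two steps are missed, so the Monte Carlo estimate in this sketch tends to lie somewhat above the other two values for a finite step $\tau$.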


4.2. Application to the integrated Brownian motion 


In this section we focus our attention on the IBM case with $\mu_2 = 0$. In
order to apply the Monte Carlo method, we simulate N = 10° trajectories, using 
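the corresponding iterative form obtained from (3) with $\mu_1 = 0$, $\sigma_1 = 0$:

$$\begin{cases} X_1(t_{k+1}) = X_1(t_k) + X_2(t_k)\,\tau + \sigma_2\displaystyle\int_{t_k}^{t_{k+1}}\left(W_2(s) - W_2(t_k)\right)ds \\ X_2(t_{k+1}) = X_2(t_k) + \sigma_2\left(W_2(t_{k+1}) - W_2(t_k)\right) \end{cases}$$

with discretization step $\tau = 0.01$ and diffusion parameter $\sigma_2 = 1$, and we detect
the first passage time of each trajectory across two fixed constant boundaries $-n$
and $m$ with $m = n = 1$.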
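
A minimal Monte Carlo sketch of this procedure is given below. It is an illustration under assumptions that are not stated in the text: the paths start at $(X_1, X_2) = (0, 0)$, only a few hundred trajectories are simulated, and the integral term is drawn jointly with the increment of $X_2$ from the bivariate normal law of the innovation (for $\sigma_1 = 0$ the regression slope is $\tau/2$ and the residual variance is $\sigma_2^2\tau^3/12$).

```python
import math
import random


def ibm_first_passage_times(n_paths=500, sigma2=1.0, tau=0.01, t_max=5.0,
                            n=1.0, m=1.0, seed=0):
    """First passage times of X1 through -n or m for IBM paths started at (0, 0);
    math.inf is recorded when no crossing occurs before t_max."""
    rng = random.Random(seed)
    n_steps = int(round(t_max / tau))
    sd2 = sigma2 * math.sqrt(tau)                 # std of the X2 increment
    slope = tau / 2.0                             # cov(J1, J2) / var(J2)
    sd1 = sigma2 * math.sqrt(tau**3 / 12.0)       # residual std of J1 given J2
    times = []
    for _ in range(n_paths):
        x1 = x2 = 0.0
        hit = math.inf
        for k in range(1, n_steps + 1):
            j2 = sd2 * rng.gauss(0.0, 1.0)
            j1 = slope * j2 + sd1 * rng.gauss(0.0, 1.0)
            x1 += x2 * tau + j1                   # uses X2 at the previous instant
            x2 += j2
            if x1 <= -n or x1 >= m:
                hit = k * tau
                break
        times.append(hit)
    return times


def survival(times, t):
    """Empirical survival probability: fraction of paths with first passage after t."""
    return sum(1 for T in times if T > t) / len(times)


times = ibm_first_passage_times()
print([survival(times, t) for t in (0.1, 1.0, 3.0, 5.0)])
```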
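The survival probability $p$ for the IBM is defined by the PDE (5) with the IBM
generator (8)

$$\frac{\partial p}{\partial t} = y\,\frac{\partial p}{\partial x} + \frac{1}{2}\sigma_2^2\, \frac{\partial^2 p}{\partial y^2}, \qquad t \in [0,T], \quad (x,y) \in D = [-n, m] \times \mathbb{R},$$

$$p(t,-n,y) = 0 \quad (y > 0), \qquad p(t,m,y) = 0 \quad (y < 0),$$

$$p(t,x,-\infty) = 0, \qquad p(t,x,+\infty) = 0, \qquad p(0,x,y) = \mathbf{1}_D.$$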


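Here we notice that the unknown $p(t,x,y)$ depends on the time $t$ and on the initial values
$x$ and $y$. The application of the implicit difference method leads to the following
discretization scheme, in accordance with the method presented in [7]:

$$\frac{1}{2}\sigma_2^2\, \frac{p_{i,j-1,k+1} - 2p_{i,j,k+1} + p_{i,j+1,k+1}}{h_y^2} + j h_y\, \frac{p_{i+1,j,k+1} - p_{i,j,k+1}}{h_x} = \frac{p_{i,j,k+1} - p_{i,j,k}}{h_t},$$
$$i = 0, \ldots, n_x, \quad j = 1, \ldots, n_y, \quad k = 0, \ldots, n_t - 1;$$

$$\frac{1}{2}\sigma_2^2\, \frac{p_{i,j-1,k+1} - 2p_{i,j,k+1} + p_{i,j+1,k+1}}{h_y^2} = \frac{p_{i,j,k+1} - p_{i,j,k}}{h_t},$$
$$i = 1, \ldots, n_x, \quad j = 0, \quad k = 0, \ldots, n_t - 1;$$

$$\frac{1}{2}\sigma_2^2\, \frac{p_{i+1,j-1,k+1} - 2p_{i+1,j,k+1} + p_{i+1,j+1,k+1}}{h_y^2} + j h_y\, \frac{p_{i+1,j,k+1} - p_{i,j,k+1}}{h_x} = \frac{p_{i+1,j,k+1} - p_{i+1,j,k}}{h_t},$$
$$i = 0, \ldots, n_x, \quad j = -1, \ldots, -n_y, \quad k = 0, \ldots, n_t - 1.$$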


We then get a linear system of $2(n_x + 1)\,n_y n_t + n_x n_t$ variables. The matrix involved
is sparse and, even if it is not symmetric, it is a band matrix.

In figure 2 the survival probabilities obtained via the Monte Carlo method and
the finite differences method, implemented with discretization steps $h_x = 0.04$,
$h_y = 0.5$ and $h_t = 0.05$ or $h_t = 0.01$, are compared.

Figure 2. Survival probabilities obtained via the Monte Carlo method (solid line) and the
finite differences method implemented with discretization steps $h_x = 0.04$, $h_y = 0.5$ and
$h_t = 0.05$ (dotted line) or $h_t = 0.01$ (dashed line) in case of IBM.


We can notice that the three curves agree to a large extent, but some difficulties
arise in managing very small discretization steps, since they lead to the handling of
very large matrices.


5. Applications to the atomic clock error 


The survival probability is an interesting quantity from the applicative point
of view, since it tells us how long we can wait, before re-synchronizing the clock,
while being sure, at a certain percentage level, that the error has not crossed two fixed
constant boundaries. In fact, if we evaluate the abscissa $t$ corresponding to a given
survival probability, for example prob = 0.95, we obtain the time $t$ during which
the process stays within the boundaries with probability
0.95.
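For the driftless BM with symmetric boundaries, the time associated with a given survival level can be computed from the classical eigenfunction series for a Wiener process absorbed at the ends of an interval. The sketch below uses that textbook series, which is not reproduced in this section, and assumes a start at zero error; the function names and the bisection search are illustrative choices, and the diffusion parameter is assumed to be expressed in ns per square root of day.

```python
import numpy as np

def bm_survival_analytic(t, sigma, m, n_terms=2000):
    """Survival probability at time t of a driftless BM started at 0 with
    absorbing boundaries at -m and +m (eigenfunction series)."""
    k = np.arange(n_terms)
    odd = 2 * k + 1
    decay = np.exp(-((odd * np.pi * sigma) ** 2) * t / (8.0 * m**2))
    return float(np.sum(4.0 / (odd * np.pi) * (-1.0) ** k * decay))

def time_to_survival_level(level, sigma, m, t_hi=1.0):
    """Time at which the survival probability drops to `level` (bisection)."""
    while bm_survival_analytic(t_hi, sigma, m) > level:
        t_hi *= 2.0                       # enlarge the bracket until it contains the root
    t_lo = 0.0
    for _ in range(80):
        t_mid = 0.5 * (t_lo + t_hi)
        if bm_survival_analytic(t_mid, sigma, m) > level:
            t_lo = t_mid
        else:
            t_hi = t_mid
    return 0.5 * (t_lo + t_hi)

if __name__ == "__main__":
    # boundaries of 10 ns and sigma = 3.05, as in the BM case discussed here
    print(time_to_survival_level(0.95, sigma=3.05, m=10.0))
```

Under the stated unit assumption, the 95% level for boundaries of 10 ns is reached after roughly two days.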

In Table 1 we report the time $t$ (in days) that can elapse before the clock
error exceeds the boundaries for different percentage levels and different values
of the boundaries $m = n$. These results were obtained for a BM with drift $\mu = 0$
and variance coefficient $\sigma = 3.05$, i.e. the noise level of a typical cesium clock.


Table 1. Values of survival time $t$ (in days) corresponding to
different percentage levels of the survival probability and to
different choices of the boundaries $m = n$ (in nanoseconds)
(BM case with $\mu = 0$ and $\sigma = 3.05$)

For example, fixing the maximum error allowed at ±10 ns, from Table 1 we
can deduce that we can wait two days while being 95% sure that the error has not
yet exceeded the permissible error.

Table 2 concerns a similar analysis of an IBM with drift $\mu_2 = 0$ and variance
coefficient $\sigma_2 = 1$. Also in this case we choose the parameters of a typical
cesium clock. Fixing the maximum error allowed at ±10 ns, from Table 2 we
can deduce that we can wait four days while being 95% sure that the error has not
exceeded the permissible error.

Table 2. Values of time $t$ (in days) corresponding to
different percentage levels of the survival probability (IBM
case with $\mu_2 = 0$ and $\sigma_2 = 1$)





6. Conclusion 


The paper describes the model of the atomic clock error based on
stochastic differential equations. In addition, related properties such as the
Partial Differential Equation describing the survival probability have been
studied because they can help in understanding and predicting clock error in
many different metrological applications. In fact, the study of the survival
probability indicates how often a clock has to be re-synchronized in order not to exceed
a limit of permissible error. The accurate study of these properties is of great
interest also for applications to satellite systems and to the Mutual
Recognition Arrangement.
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In metrology the measurements on the same standard are usually repeated several times in 
each Laboratory. Repeating measurements is also the idea behind making inter- 
comparisons in different Laboratories of standards of a physical or chemical quantity. 
Often, whether or not these results can be considered as repeated measurements is not
obvious, so that treating the data statistically as if they were repeated data can lead to
misleading results. The paper reviews the use of two classes of methods keeping track of
the fact that the data are collected in series: a) those considering a class of regression 
models able to accommodate both the commonality of all series and the specificity of 
each series; b) those using the mixture probability model for describing the pooled 
statistical distribution when the data series are provided in a form representing their 
statistical variability. Some problems related to the uncertainty estimate of the latter are 
also introduced. 


1. Introduction 


An essential method in metrology consists in repeating, in each Laboratory,
the measurements on the same standard several times; this basic method also
applies to inter-comparisons in different Laboratories of standards of a physical
or chemical quantity.

In both cases, a basic question in deciding the statistical treatment of
the overall data is whether or not they can be considered as repeated
measurements. Often the answer is not obvious, and consequently a statistical
treatment of the data as a compound (data fusion), based only on the a priori
assumption that they are repeated data, can lead to misleading results.

Particularly important in metrology are the mathematical and statistical 
methods that are able to take into account the fact that the data are collected in 
series. Using models that are able to take into account both the specificity and 
the commonality of the data series can better perform the operation of collation, 
often also called “data fusion”. Then, statistical tests can be performed a 
posteriori on rational bases on the estimated quantities to verify the repeated- 
measurement hypothesis. The paper discusses the use of compound modelling: 


* Work partially funded under EU SofTools_MetroNet Contract N° G6RT-CT-2001-05061.

a) when the data depend on one independent variable in a range. They need
the fitting of a regression model function, which needs to accommodate
both the commonality of all series of data and the specificity of each
series;

b) when the individual data of each series are not provided, but a “summary” form is. The
summary represents the statistical variability of the data, e.g., when the
probability density function (pdf) is provided as the overall information
concerning each series of data. In these cases, the use of a mixture
probability model for the pooled statistical distribution does not require
the repeated-measurement hypothesis.


These methods are summarised in the paper and can be applied either to data 
series taken in a single Laboratory or to inter-comparison data. 


2. Data in Series 


Several measurements are taken of the same standard, yielding the data:

$$x_1^{(m)}, \ldots, x_i^{(m)}, \ldots, x_{N_m}^{(m)} \qquad (1)$$

where each $x_i^{(m)}$ is affected by a random error $\varepsilon_i^{(m)}$. Several series of these data
exist (e.g., measurements taken at different times, in different Labs, in different
conditions, ...):

$$\begin{array}{c} x_1^{(1)}, \ldots, x_i^{(1)}, \ldots, x_{N_1}^{(1)} \\ \vdots \\ x_1^{(M)}, \ldots, x_i^{(M)}, \ldots, x_{N_M}^{(M)} \end{array} \qquad (2)$$

The metrologist has to devise a summary of the information conveyed in
the $\sum_{m=1}^{M} N_m$ data (2). The procedure leading to the choice of the summary
comprises, at least:
a) the identification of the measurand(s) according to the purpose of the 
measurements; 
b) the choice of a statistical treatment well suited to obtain the expectation 
value of the measurand(s); 
c) the estimate of the uncertainty associated to the information represented 
by the overall data (2). 
The importance, and sometimes the difficulty, of step a) is often underestimated.
The identification can be difficult or even controversial, depending on
the purpose of the measurements and/or of the data analysis. In all instances, it
determines the type of mathematical modelling and of statistical treatment that is
more appropriate for the subsequent steps.


Let us use for each line of (2) the following 


DEFINITION: The series is a set of repeated measurements (usually
homoscedastic, i.e. with $\mathrm{var}(\varepsilon_i) = \mathrm{const}$).


2.1. Types of standards 


Different types of standards require different answers to the question a) about 
the meaning of the series of data. The meaning is different for two general 
classes defined in [1,2]. 


a) Set of artefact standards 


DEFINITION: A measurement device for which a “natural value” does not exist
is called an artefact standard.


Examples are the mass or the length of a piece of metal. Each artefact carries its
own value of the physical or chemical quantity, which can be estimated only by
calculation and/or through comparison with others. From a statistical point of
view, having a set of M artefact standards, each m-th artefact (one line in (2))
is a different measurand and each measurement series $x^{(m)}$ pertains to a different
population, i.e. to a different random variable $X^{(m)}$. The measurement series can
be obtained either in one Laboratory (M local standards) or in a set of M
Laboratories (inter-comparison of M standards of different Laboratories).


b) Single standard 


A definition of a single standard applies to several cases: 

i) each single standard in a Laboratory, where M series of measurements 
are performed on it, typically at subsequent times (an artefact also 
pertains to this category as far as it is considered individually and M 
infra-Laboratory evaluations are concerned); 

ii) single travelling standard used in an inter-comparison of M 
Laboratories; 

iii) set of M realisations of the same state of a physical quantity, a natural 
value of which exists, representing the single standard (concerning 
either infra- or inter-Laboratory measurements). 


In all these cases, from a metrological and statistical viewpoint, there is only one
measurand and all data $x_i^{(m)}$ pertain to the same population, i.e., to a single
random variable, $Q$, although the data are still subdivided into series.
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2.2. One random variable versus several 


There is a basic difference in the statistical reasoning that must be applied to the 
data series, depending on whether they deal with one or with several (two or 
more) random variables. 


a) Several variables (as in class a) standards), one for each m-th data series. 


The synthetic information concerning the set of variables $X^{(m)}$ involves a
summary statistic, typically a mean value, $E$.

The new random variable $E$ is at a different (upper) level with respect to the
input variables $X^{(m)}$, i.e., a hierarchical reasoning is introduced [6]. The
uncertainty evaluation involves products of the probability distribution
functions $F^{(m)}$ associated with the $X^{(m)}$ (convolution in the linear case). In fact,
there is uncertainty propagation in the sense of the GUM [3].

Figure 1. Typical GUM model: inputs $X^{(m)}$ with distribution functions $F^{(m)}$, output $E$.


b) One variable (as in class b) standards), M data series. 


The operation of compounding the $x_i^{(m)}$ data is called data pooling, also referred
to as “data fusion” (of the series) [12]. Since the variable $Q$ is common to all
data, no input-output model (Fig. 1) of the measurements exists or is needed.
A single probability distribution function $F$ is associated with $Q$, as the union
(summation) of the distribution functions $F^{(m)}$ of each series of pooled data.


2.3. Several data series for a single random variable 


The attitude opposite to overlooking the occurrence of a single standard (and,
thus, always considering each m-th series as pertaining to a separate random
variable) is to assume all data series as pooled into a single series of $\sum_{m=1}^{M} N_m$ data,
overlooking, and losing, the information related to the possible series
individuality.




On the contrary, series individuality is information additional to that
provided by their commonality, and it should be analysed in order to allow the
metrologist to answer important questions (see Section 2.1.b):

Case i) (one standard in a Laboratory): Is the standard stable in time? Is it
reproducible? Can the data of subsequent series taken in the same
Laboratory be considered as repeated measurements?

Case ii) (single travelling standard of an inter-comparison): Is the standard
stable in time? Is it reproducible? Can the data of all the Laboratories
participating in the inter-comparison be considered as repeated
measurements? (similar to Case i)

Case iii) (physical state realisation, either in one or in several Laboratories):
Are the realisations of the standard equally well implementing the physical
state? Can all realisations (data series) be considered as repeated
measurements?


2.4. Compound modelling techniques 


The models keeping explicit both the individuality of each series and the
commonality of all data will be referred to in the following as “compound
models”; two techniques using them are briefly discussed below:
• the input-output model $y_i^{(m)} = f(x_i^{(m)})$ for data series;
• the model of the statistical distribution function for single-standard
problems.

A more comprehensive discussion can be found in [2,4-6]. The basic
advantage of their use is that, after having determined the parameters of the
compound model, statistical tests can then be applied to the series-specific
parameters to check if they are significant or not. If not, all the data can be
considered as repeated measurements: this decision is now based on a statistical
analysis of each series and not on an a priori unverified assumption. If all or some
of the series are significantly specific, the metrologist has a sound basis for
further inference and data analysis.


3. Compound-modelling of Data Series in Regression 


In regression of data collected in series, the so-called “Least Squares Method
with Fixed Effect” (LSMFE) uses a model written, in general, as:

$$y_i^{(m)} = \mathrm{Common}(x_i^{(m)}) + \mathrm{Specific}(x_i^{(m)}) \qquad (3)$$

with $m = 1, \ldots, M$ and $i = 1, \ldots, N_m$. For example, with a polynomial model for
both parts, and of degree one for the Specific one, the regression model is:

$$g(x) = \sum_{u=0}^{U} a_u x^u + \sum_{m=1}^{M} \delta_m \left( t_m + b_m x \right) \qquad (4)$$

where $\delta_m = 1$ if $x$ is in the m-th series and zero otherwise, and the $a_u$, the $t_m$ and the
$b_m$ are regression parameters to be determined. The (U+2M+1) regression
parameters are independent only if all $t_m$ sum to zero (or to another prescribed
constant). If the r-th series is taken as the “reference”, the r index is skipped in
the second summation and only (U+2M-1) parameters are to be computed. A
statistical significance test can then be applied to the $t_m$ and $b_m$.
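The reference-series form of model (4) can be fitted with ordinary linear least squares once the design matrix is assembled. The sketch below is only an illustration under assumptions: unweighted least squares, one series chosen as reference, and hypothetical function and argument names. It returns the common polynomial coefficients and, for every non-reference series, its offset and slope correction.

```python
import numpy as np

def fit_lsmfe(x, y, series, degree, reference=0):
    """Least squares fit of a common polynomial plus series-specific
    degree-one corrections (t_m + b_m * x), the reference series having none.

    Returns (a, specific) with a[u] the coefficient of x**u and
    specific[m] = (t_m, b_m) for every non-reference series label m.
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    series = np.asarray(series)
    labels = [m for m in np.unique(series) if m != reference]
    blocks = [np.vander(x, degree + 1, increasing=True)]   # columns x^0 .. x^U
    for m in labels:
        delta = (series == m).astype(float)                # indicator of series m
        blocks.append(np.column_stack([delta, delta * x]))
    design = np.hstack(blocks)          # (U + 1) + 2 (M - 1) = U + 2M - 1 columns
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    a = coef[: degree + 1]
    specific = {m.item(): (coef[degree + 1 + 2 * k], coef[degree + 2 + 2 * k])
                for k, m in enumerate(labels)}
    return a, specific

if __name__ == "__main__":
    x = np.tile(np.linspace(0.0, 1.0, 20), 3)
    series = np.repeat([0, 1, 2], 20)
    y = 1.0 + 2.0 * x + 0.3 * (series == 1) - 0.2 * (series == 2)
    print(fit_lsmfe(x, y, series, degree=1))
```

A significance test on each estimated pair would additionally need the covariance of the estimates, which this sketch does not compute.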

Figure 2 reports the simplest possible case, series differing only by a
translation; the effect is less visually evident when Type A standard uncertainties
are large, as in figure 3. More applications can be found in [4, 5].

Figure 2. Series differing by a translation: Specific = const (simplest example). Axes: y (thermometer resistance, thermophysical property, or other physical quantity) versus x (typically time, ...).

Figure 3. Example of series differing by a translation with predominant Type A standard
uncertainties.
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4. Compound Probability-Distribution Models for Single-Standard 
Data Series 


Let us now consider the problem of describing data variability in an inter-comparison
(IC) of standards. The n-th participant’s data series is of the form
(1), and (2) represents the overall input data to the IC, one series for each
participant (local data). However, in the case of an IC, the local data are
normally summarised to provide three items, representing the overall statistical
properties of the local standard:

$$\{(Y_n \pm U_n);\ F_n\}, \qquad n = 1, \ldots, N \qquad (5)$$

where $Y_n$ is the value assigned to the local standard and, in the probabilistic
sense, is the location parameter of the local population, and $U_n$ is the expanded
uncertainty associated with $Y_n$, generally set at the 95% confidence level. $F_n$ is the
statistical distribution associated with the local standard to identify the population:
at least the general class of models assumed for it should be provided [3].

Consider, however, an IC of standards based on different realizations of the
same physical state (class b), case iii) in Section 2.1), e.g., temperature, pressure.
For a full treatment of this case see [6]; for a comparison with the case of
artefacts see [7]. Data are regarded as sampled from a single stochastic variable,
$Q$. Therefore, for the sample $(x_{1n}, \ldots, x_{jn}, \ldots, x_{mn})$ of every n-th participant

$$X_n \equiv Q \quad \forall n \qquad (6)$$

Since all data series $(x_{1n}, \ldots, x_{jn}, \ldots, x_{mn})$ pertain to the same population, they
should be considered, at least a priori, as homogeneous and, consequently, it is
appropriate to pool the local samples into a single sample (of total size $\sum_{m=1}^{M} n_m$)
and to estimate an expected value of the resulting super-population. Summary
operations in each Laboratory, i.e., the definition of a new stochastic variable $Y_n$
combining the local samples to estimate the local summary parameters and its
$U_n$, should be avoided since they would improperly introduce a hierarchy.

$Q$ is distributed according to a distribution function $F$, while each local
probability distribution $F_n$ is sufficient to characterize the
local stochastic variability of the n-th population. $f(x; \Lambda)$ can be defined as the
mixture of density functions, or mixture model of the super-population [6]:

$$f(x; \Lambda) = \sum_{n=1}^{N} \pi_n f_n(x; \Lambda_n), \qquad \sum_{n=1}^{N} \pi_n = 1, \qquad n = 1, \ldots, N \qquad (7)$$

where each $f_n(x; \Lambda_n)$ is the density function of $F_n$ and $\pi_n > 0$, the $\pi_n$ being the proportions.
The distribution function $F$ is the compound distribution of the $N$ components
$f_n$.

The description of the overall data variability with a single pdf $f$ does not
require further assumptions, except those embedded in the identification of the
local density functions $f_n$. Thus, (7) directly represents the overall variability of
the physical state realisation, as resulting from the IC. An example of a bimodal
distribution is shown in Fig. 4 [8]. In this case the metrologist has guidance for
further inference about the data analysis [12]. In less evident cases, statistical
tests can be applied to decide whether the distribution is significantly different from a
unimodal one or from a given class of distributions, e.g., Normal.

Figure 4. Mixture density distribution for a 1984 inter-comparison (argon); horizontal axis: temperature ΔT / mK.


In any instance, from the mixture distribution function, the IC summaries can be
obtained in a straightforward way. The value called Reference Value (RV) in
the Key Comparisons (KC) of the MRA [9] is the expected value of the distribution,
the numerical value $r$:

$$r = E(F) \qquad (8)$$

The computation of (8) can be performed analytically or numerically.
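Both routes to the expected value of the mixture can be illustrated with a short sketch. It assumes, purely for illustration, Normal local densities and equal proportions unless weights are supplied; neither choice is prescribed by the text, and the function names are hypothetical. The sketch also returns the variance of the mixture and the deviation of each local expected value from the mixture mean.

```python
import numpy as np

def normal_pdf(x, mu, sd):
    """Density of a Normal distribution with mean mu and standard deviation sd."""
    return np.exp(-0.5 * ((x - mu) / sd) ** 2) / (sd * np.sqrt(2.0 * np.pi))

def mixture_summary(means, sds, weights=None, grid_points=20001):
    """Expected value of a mixture of Normal components, computed analytically
    and by numerical integration, plus the mixture variance and the deviations
    of the component means from the mixture mean."""
    means = np.asarray(means, dtype=float)
    sds = np.asarray(sds, dtype=float)
    n = means.size
    if weights is None:
        w = np.full(n, 1.0 / n)                     # equal proportions (assumption)
    else:
        w = np.asarray(weights, dtype=float) / np.sum(weights)
    r_analytic = float(np.sum(w * means))
    # numerical route: Riemann sum of x * f(x) on a grid covering all components
    x = np.linspace((means - 8.0 * sds).min(), (means + 8.0 * sds).max(), grid_points)
    f = sum(w_n * normal_pdf(x, m_n, s_n) for w_n, m_n, s_n in zip(w, means, sds))
    r_numeric = float(np.sum(x * f) * (x[1] - x[0]))
    variance = float(np.sum(w * (sds**2 + means**2)) - r_analytic**2)
    deviations = means - r_analytic
    return r_analytic, r_numeric, variance, deviations

if __name__ == "__main__":
    print(mixture_summary([0.0, 0.1, 1.0], [0.1, 0.2, 0.5]))
```

With one very accurate and one very inaccurate component, the returned variance is dominated by the inaccurate one, which is the effect on the second moment discussed for comparisons with participants of very different quality.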


The value of the Degree of Equivalence (DoE) of the KC, $a_{nr}$, for the n-th
participant is:

$$a_{nr} = E(F_n) - r \qquad (9)$$


4.1. Evaluation of a KC uncertainty


The discussion of the problem of evaluating the uncertainty of a KC has so far
been essentially associated with the discussion of the Reference Value (RV),
considered as a necessary step to evaluate the uncertainty of the DoE required
by the MRA. This might not always be appropriate. Here the possible problems
are summarised; they deserve a much more comprehensive analysis.


4.2.1 The RV as a deterministic parameter (DRV) 


The RV is not necessarily a random variable. A Deterministic Reference Value
(DRV) can be defined, which is a numerical value, $r$, with no uncertainty
associated with it. Irrespective of the method used to obtain it, the DRV is a purely
stipulated value, as opposed to a statistically-generated one. Its purpose is
limited to the computation of the DoE, where it determines the deviation of the
measurand value of every participant from the RV.

This kind of RV definition is not in contrast with the MRA requirements
and can have the advantage of avoiding the problems of a critical choice of the
summary statistics, when its meaning or use is not essential to the use of the
inter-comparison results. In addition, it avoids the problems arising from the
correlation of the RV with the participants’ values, as no uncertainty is
associated with the DRV. Only the Degree of Equivalence (DoE) has an associated
uncertainty, as required by the MRA [9]. A DRV has been preferred for temperature
key comparisons CCT K2 and CCT K4 [10,11].

In the case a DRV is used, the estimate of the uncertainty for the DoE of the
n-th participant is simply:

$$u(a_{nr}) = u_n \qquad (10)$$

This would also be the uncertainty to associate with the DoE in (9).


4.2.2 The RV with an associated uncertainty 


However, most KCs do associate an uncertainty with the RV. When several local
random variables are defined (class a) standards), the RV is a random variable,
whose uncertainty is estimated according to the usual statistical methods.

When all the local data series pertain to the same $Q$ (class b) standards)
there are at least two possible ways to estimate the uncertainty:

• from the statistical variability of $Q$, as represented by $F$. Usually, the
second moment of the distribution is taken as the measure of $u$;

• from the set of the local uncertainties $\{u_n\}$ provided by the participants.


The first way to define the uncertainty certainly provides the usual description
of the to-date variability in the current knowledge of the physical state (e.g.,
temperature value at the thermodynamic state called triple point), if the KC
exercise is comprehensive, e.g., comprises most worldwide-available
realizations of the physical state. However, it is also evident that, by accepting
all results as valid and with no screening of the participants’ quality (screening being
prevented by the MRA), the second moment of $F$ can be dominated by the uncertainty of
the participants of lesser quality. This situation becomes extreme for an
inter-comparison including only one very accurate and one very inaccurate
Laboratory.

Therefore, the second way to define the KC uncertainty might be more
suitable to picture the actual average quality of the participants, especially in
top-accuracy metrology, where the $u_n$ are dominated by Type-B uncertainties
and consequently represent the overall quality of each Laboratory more than
the variability of its data.

The set $\{u_n\}$ of local uncertainties needs to be summarised. In the case of a
KC with many participants, the median, acting like a kind of “majority rule” of
the average participant measurement capability, could be a suitable choice:

$$u_r = \mathrm{median}(u_n), \quad n = 1, \ldots, N, \qquad F(u_r) = F(\mathrm{median}) \qquad (11)$$


On the other hand, this would not be the case for an inter-comparison with 
only two participants. Alternatively, the metrologists could resort to some kind 
of consensus value for the KC uncertainty, as opposed to one purely based on 
sound statistical reasoning. 


4.2.3 No RV defined 

This is a peculiar case, allowed by the MRA in special circumstances. Obviously, only the
DoE between each pair of Laboratories can be defined, and its uncertainty
evaluation (and the KC uncertainty evaluation) requires yet another kind
of reasoning. However, does the MRA really allow for a non-uniform way to
evaluate the uncertainty of different KCs (and hence of the international
equivalence) only because of a different choice of the RV?


5. Conclusions 


The paper introduces the common problem in measurement science arising
when the treatment of data series needs to take into account both their common
and their series-specific characteristics. These problems find a different solution
in methods suitable for different types of data series.

Two types of metrological data series have been discussed, for which in
recent years mathematical and statistical tools have been adapted to the
metrological field. The first is when the data series need to be fitted to a single
regression model in a range of values of the independent variable (e.g., data
from different sources or from an unstable standard): the LSMFE solves the
problem in a suitable way. The second is when each data series is summarized
according to its statistical variability and all data series pertain to the same
population, though arising from different sources: the use of a single (mixture)
distribution function to describe the overall variability of the pooled data solves
the problem and provides a straightforward definition of the required statistical
parameters. In the latter case, the peculiar problems of defining an uncertainty
for the overall data are introduced and will deserve further study.
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This paper presents a new homotopic algorithm for solving Elementwise- Weighted 
Total-Least-Squares (EW-TLS) problems. For this class of problems the assumption
of identical variances of data errors, typical of classical TLS problems, is
removed, but the solution is not available in a closed form. The proposed iterative
algorithm minimizes an ad hoc parametric weighted Frobenius norm of errors.
The gradual increase of the continuation parameter from 0 to 1 allows one to overcome
the crucial choice of the starting point. Some numerical examples show the
capabilities of this algorithm in solving EW-TLS problems.


1. Introduction 


In different scientific areas, various problems are formulated as overdetermined
systems of linear equations of the type

$$A x \approx b, \qquad (1)$$

where the entries of $A \in \mathbb{R}^{m \times n}$, with $n < m$, and $b \in \mathbb{R}^{m}$ are the noisy
data of the problem and $x \in \mathbb{R}^{n}$ is the unknown vector; hereafter, the set
$\mathcal{I} = \{1, 2, \ldots, m\}$ groups the indices of the scalar equations embedded in Eq.
(1) and the set $\mathcal{J} = \{1, 2, \ldots, n\}$ groups the indices of the unknowns.

In the Ordinary Least Squares (OLS) problems, independently distributed
(i.d.) errors are assumed to affect only vector $b$. Methods for
estimating the effect of this assumption on the OLS solution are given in [1].
The Total Least Squares (TLS) problems, introduced in [2], take into account
i.d. data errors in both $A$ and $b$, under the assumption of identical variances
(i.i.d. errors). Their solutions have been found in a closed form, by using
the Singular Value Decomposition (SVD), as pointed out in [3], thus avoiding
the use of an iterative algorithm. A full review of general TLS problems
can be found in [4]. The case opposite to the OLS problem, though less popular,
is the Data Least Squares (DLS) problem [5], where i.i.d. errors affect only
the $m \times n$ entries of $A$.

* Work partially funded under EU SofTools_MetroNet Contract N. G6RT-CT-2001-05061.
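For reference, the closed-form solution of the classical TLS problem through the SVD can be sketched as follows. This is the standard textbook construction for i.i.d. errors, based on the right singular vector of the augmented matrix associated with the smallest singular value; the function name is hypothetical and the generic (non-degenerate) case is assumed.

```python
import numpy as np

def tls_svd(A, b):
    """Classical total least squares solution of A x ~ b via the SVD of [A | b].

    Assumes the generic case in which the last component of the right singular
    vector belonging to the smallest singular value is not zero.
    """
    A = np.asarray(A, dtype=float)
    b = np.asarray(b, dtype=float)
    n = A.shape[1]
    _, _, vt = np.linalg.svd(np.column_stack([A, b]))
    v = vt[-1]                     # right singular vector of the smallest singular value
    if abs(v[n]) < 1e-14:
        raise ValueError("non-generic TLS problem: no solution of this form")
    return -v[:n] / v[n]

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    A = rng.normal(size=(8, 3))
    x_true = np.array([1.0, -2.0, 0.5])
    print(tls_svd(A, A @ x_true))  # error-free data: x_true is recovered
```

No such one-shot formula is available once every entry carries its own variance, which is what motivates an iterative method for the elementwise-weighted case.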

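Unfortunately, in many problems $A$ and $b$ represent various physical
quantities and, therefore, are measured with different accuracy. As a consequence,
entries $a_{ij}$ of $A$ and $b_i$ of $b$ are affected by independently distributed
(i.d.) errors, $\Delta a_{ij}$ and $\Delta b_i$, with zero averages and different variances $\sigma_{ij}^2$
$\forall (i,j) \in \mathcal{I} \times \mathcal{J}$ and $\sigma_{ib}^2$ $\forall i \in \mathcal{I}$. Hereafter, the corresponding standard
deviations are cast in the following matrix and vector

$$\Sigma = \begin{bmatrix} \sigma_{11} & \sigma_{12} & \cdots & \sigma_{1n} \\ \sigma_{21} & \sigma_{22} & \cdots & \sigma_{2n} \\ \vdots & \vdots & & \vdots \\ \sigma_{m1} & \sigma_{m2} & \cdots & \sigma_{mn} \end{bmatrix} \quad \text{and} \quad \sigma = \begin{bmatrix} \sigma_{1b} \\ \sigma_{2b} \\ \vdots \\ \sigma_{mb} \end{bmatrix}$$

respectively. These problems are called Elementwise-Weighted Total-Least-Squares
(EW-TLS) problems [6] and are of wide interest in many
measurement applications, as evidenced in [7] and [8].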

A solution in a closed form is not known for generic EW-TLS problems. 
This paper proposes an original — to authors’ knowledge — and robust iter- 
ative algorithm solving these problems. It is based on the minimization of 
the weighted Frobenius norm (WFN) of errors °, that results to be equal 
to the sum of ratios of suitable quadratic forms versus x. This objective 
function is convex constrained to a finite subregion including the expected 
minimum, and requires an iterative algorithm to find the solution. How- 
ever, the choice of a good starting point is critical, because the boundary 
of the convexity subregion cannot be easily determined. 

In the algorithm proposed here, the original WFN is preliminarily expanded
into a parametric family of functions by introducing a suitable continuation
parameter $\eta \in [0,1]$ and approximated by the truncated second-order
Taylor formula. Initially the minimum corresponding to $\eta = 0$ is
determined in a closed form, as the problem reduces to the OWLS one, and
then exploited as the starting point to determine the minimum of the objective
function for $\eta$ close enough to 0. Then, according to a homotopic
strategy, $\eta$ is gradually increased up to $\eta = 1$, to find the minimum of
the original WFN. In other words, the algorithm follows, step by step, the
movement of the minimum versus $\eta$ in order to center the solution of the
original EW-TLS problem. At the k-th step the minimum of the objective
function is determined by a few Newton (algorithm) sub-steps, based on
its truncated Taylor formula in the neighborhood of the solution $x^*$ of the
previous sub-step.

When a Hessian matrix that is not positive definite is met in the Taylor formula,
the iterations of the Newton algorithm are stopped and the algorithm
is restarted with a lower value of $\eta$ to keep the current solution inside the
convexity region. In this way, the critical choice of the starting point is
overcome.
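The continuation strategy can be sketched in a few lines. The parametric family used below is an assumption made for illustration, since the paper's own family is not reproduced in this part of the text: the variances of the entries of A are scaled by the continuation parameter, giving the objective sum_i (a_i^T x - b_i)^2 / (eta * x^T S_i x + sigma_ib^2), which is the OWLS objective at eta = 0 and the sum-of-ratios form of the WFN at eta = 1. All names are hypothetical, and the fallback simply halves the continuation step when a Hessian fails a Cholesky test.

```python
import numpy as np

def grad_hess(x, eta, A, b, var_A, var_b):
    """Gradient and Hessian of sum_i r_i^2 / d_i with r_i = a_i^T x - b_i and
    d_i = eta * sum_j var_A[i, j] * x_j^2 + var_b[i] (assumed family)."""
    r = A @ x - b
    d = eta * (var_A @ x**2) + var_b
    G = 2.0 * eta * var_A * x                  # row i: gradient of d_i
    w2, w3 = r**2 / d**2, r**2 / d**3
    grad = 2.0 * A.T @ (r / d) - G.T @ w2
    cross = A.T @ (G * (r / d**2)[:, None])
    hess = (2.0 * A.T @ (A / d[:, None]) - 2.0 * (cross + cross.T)
            - np.diag(2.0 * eta * (var_A.T @ w2))
            + 2.0 * G.T @ (G * w3[:, None]))
    return grad, hess

def ewtls_homotopy(A, b, var_A, var_b, d_eta=0.25, tol=1e-12, max_newton=50):
    """Follow the minimizer from eta = 0 (weighted least squares) to eta = 1."""
    w = 1.0 / var_b
    x = np.linalg.solve(A.T @ (A * w[:, None]), A.T @ (w * b))   # eta = 0 start
    eta = 0.0
    while eta < 1.0:
        target = min(1.0, eta + d_eta)
        x_try, ok = x.copy(), True
        for _ in range(max_newton):
            g, H = grad_hess(x_try, target, A, b, var_A, var_b)
            try:
                np.linalg.cholesky(H)          # fails if H is not positive definite
            except np.linalg.LinAlgError:
                ok = False
                break
            step = np.linalg.solve(H, g)
            x_try = x_try - step
            if np.linalg.norm(step) <= tol * (1.0 + np.linalg.norm(x_try)):
                break
        if ok:
            x, eta = x_try, target             # accept and move on
        else:
            d_eta *= 0.5                       # retry with a smaller continuation step
            if d_eta < 1e-6:
                raise RuntimeError("continuation step became too small")
    return x
```

The sketch accepts the last Newton iterate even if the step tolerance was not reached within `max_newton` sub-steps; a production code would check convergence explicitly.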

Numerical results show the capabilities, in particular the robustness and 
the local convergence, of the algorithm. Unfortunately, a proof showing the 
global convergence is not available. However, some numerical tests of the 
algorithm on specific EW-TLS problems, whose global minimum is known, 
show the coincidence of the solution with the global minimum. 


2. Problem statement 
As in ®, we introduce the WFN of data errors on A and B 
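$$F(x, \Delta A, \Delta b) = \sum_{i \in \mathcal{I}} \left( \Delta a_i^T S_i^{-1} \Delta a_i + \Delta b_i^2\, \sigma_{ib}^{-2} \right), \qquad (2)$$

where $\Delta a_i = [\Delta a_{i1}\ \Delta a_{i2}\ \ldots\ \Delta a_{in}]^T$, $\Delta b = [\Delta b_1\ \Delta b_2\ \ldots\ \Delta b_m]^T$ and
$S_i = \mathrm{diag}(\sigma_{i1}^2, \sigma_{i2}^2, \ldots, \sigma_{in}^2)$ ($\forall i \in \mathcal{I}$). Solving the EW-TLS problem consists in
finding the optimal values $\hat{x}$, $\Delta\hat{A}$ and $\Delta\hat{b}$ of $x$, $\Delta A$ and $\Delta b$, respectively,
by minimizing $F(x, \Delta A, \Delta b)$

$$\min_{(x,\, \Delta A,\, \Delta b)} F(x, \Delta A, \Delta b) \quad \text{subject to} \quad (A - \Delta A)\,x = b - \Delta b. \qquad (3)$$

Note that $\hat{x}$, $\Delta\hat{A}$ and $\Delta\hat{b}$ are invariant with respect to any arbitrary scaling
factor in the variances. When $\sigma_{ij} = 0$, the corresponding error $\Delta a_{ij}$ is
excluded from the WFN in Eq. (2). If $S_i = 0$ and $\sigma_{ib}^2 = 0$, the i-th scalar
equation in Eq. (1) becomes exact and “≈” must be replaced by “=”.
The possible equations with $S_i = 0$ and $\sigma_{ib}^2 = 0$ play the role of “exact”
constraints on $x$. In principle, these cases can be treated as limit cases of
the EW-TLS problems, but their solution requires a different formulation,
and, therefore, their investigation is beyond the scope of the present paper.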


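By choosing specific structures of $\Sigma$ and $\sigma$, the above formulation of EW-TLS
problems includes:

(1) Ordinary Least Squares (OLS) problems: when $\sigma_{ij} = 0$ $\forall (i,j) \in \mathcal{I} \times \mathcal{J}$
and the $\sigma_{ib}$ are nonzero and invariant $\forall i \in \mathcal{I}$.

(2) Ordinary Weighted Least Squares (OWLS) problems: when $\sigma_{ij} = 0$
$\forall (i,j) \in \mathcal{I} \times \mathcal{J}$ and the $\sigma_{ib}$ are nonzero and depend on $i \in \mathcal{I}$.

(3) Total Least Squares (TLS) problems: when the $\sigma_{ij}$ $\forall (i,j) \in \mathcal{I} \times \mathcal{J}$ and
the $\sigma_{ib}$ $\forall i \in \mathcal{I}$ are nonzero and coincident.

(4) Data Least Squares (DLS) problems: when the $\sigma_{ij}$ $\forall (i,j) \in \mathcal{I} \times \mathcal{J}$ are
nonzero and coincident and $\sigma_{ib} = 0$ $\forall i \in \mathcal{I}$.

(5) Relative Error Total Least Squares (RE-TLS) problems [6]: when
$\sigma_{ij} \propto |a_{ij}|$ $\forall (i,j) \in \mathcal{I} \times \mathcal{J}$ and $\sigma_{ib} \propto |b_i|$ $\forall i \in \mathcal{I}$.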

(6) Weighted Total Least Squares (WTLS) problems ? 1° : when the 
variances verify the relations o7,= o» 0o, V19€ Tx J. 


On the other hand, EW-TLS problems can be regarded as particular 
Structured-TLS problems ê 1. However, in the latter the number of uncer- 
tain free parameters is upper constrained by the number of the m x (n+ 1) 
elements of system (1) and is, in practice, much fewer than in a generic 
EW-TLS problem. 

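With reference to Eq. (3), we substitute $\Delta b = \Delta A\, x - A x + b$ into
$F(x, \Delta A, \Delta b)$ and obtain a new expression of the WFN

$$F(x, \Delta A) = \sum_{i \in \mathcal{I}} \left[ \Delta a_i^T S_i^{-1} \Delta a_i + \left( x^T \Delta a_i - x^T a_i + b_i \right)^2 \sigma_{b_i}^{-2} \right]. \qquad (4)$$

To minimize the above WFN, we determine, for a given value of $x$, the $m$
gradients of $F(x, \Delta A)$ with respect to the unknown $n$-vectors $\Delta a_1, \Delta a_2, \ldots, \Delta a_m$

$$\frac{\partial F(x, \Delta A)}{\partial \Delta a_i} = 2 \left[ S_i^{-1} \Delta a_i + \left( x^T \Delta a_i - x^T a_i + b_i \right) \sigma_{b_i}^{-2}\, x \right] \quad \forall\, i \in \mathcal{I}.$$

Setting them to zero, we obtain the following $m$ systems of $n$ algebraic
equations in the $n$-vectors $\Delta a_1, \Delta a_2, \ldots, \Delta a_m$

$$\left( S_i^{-1} + x\, x^T \sigma_{b_i}^{-2} \right) \Delta a_i = \left( x^T a_i - b_i \right) \sigma_{b_i}^{-2}\, x \quad \forall\, i \in \mathcal{I}. \qquad (5)$$

Any matrix in Eqs. (5) coincides with the sum of a diagonal matrix and a
rank-one matrix, and its inverse is given by the closed form

$$\left( S_i^{-1} + x\, x^T \sigma_{b_i}^{-2} \right)^{-1} = S_i - \frac{S_i x\, [S_i x]^T}{x^T S_i x + \sigma_{b_i}^2} \quad \forall\, i \in \mathcal{I}.$$

So, the solutions of systems in Eqs. (5) are

$$\Delta a_i = \frac{\left( a_i^T x - b_i \right) S_i x}{x^T S_i x + \sigma_{b_i}^2} \quad \forall\, i \in \mathcal{I}. \qquad (6)$$

Hereafter, we assume that $x^T S_i x + \sigma_{b_i}^2 > 0$ for any $i$ and $\forall\, x$ in a neighborhood
of the minimizer of $F(x, \Delta A)$. In practice, this restriction is mild.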

By substituting the values $\Delta a_i$ in Eqs. (6) into $F(x, \Delta A)$ (see Eq. (4)),
we obtain a new expression of the WFN, depending only on $x$

$$F(x) = \sum_{i \in \mathcal{I}} \left[ \frac{x^T a_i - b_i}{x^T S_i x + \sigma_{b_i}^2} \right]^2 \left( x^T S_i x + \sigma_{b_i}^2 \right) = \sum_{i \in \mathcal{I}} \frac{\left( x^T a_i - b_i \right)^2}{x^T S_i x + \sigma_{b_i}^2}. \qquad (7)$$

Then the minimization of $F(x, \Delta A)$ in Eq. (4) can be substituted by the
less troublesome minimization of $F(x)$ with $x \in \mathbb{R}^n$

$$\min_{(x)} F(x). \qquad (8)$$

Note that terms $(a_i^T x - b_i)^2$ and $(x^T S_i x + \sigma_{b_i}^2)$, $i \in \mathcal{I}$, are non-negative and
analytically continuous. If $\sigma_{b_i} > 0$ the corresponding term $(x^T S_i x + \sigma_{b_i}^2)$
is positive $\forall\, x \in \mathbb{R}^n$ and $F(x)$ is finite and analytically continuous $\forall\, x \in \mathbb{R}^n$.
If $\sigma_{b_i} = 0$ the corresponding term $(x^T S_i x + \sigma_{b_i}^2)$ can be 0. For the values
of $x$ zeroing one of these terms, $F(x)$ is analytically discontinuous and
tends to $+\infty$; for the remaining values of $x$, $F(x)$ is finite and analytically
continuous.
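As a numerical sanity check of the reduction above, the following minimal Python sketch evaluates the WFN in two ways: directly from Eq. (7), and from Eq. (4) after inserting the row corrections of Eq. (6). It is an illustration only, not the authors' implementation; the function names and the small 3 x 2 data set are hypothetical, and strictly positive variances are assumed so that every division is defined.

```python
import math

def wfn_reduced(x, A, b, var_A, var_b):
    """Eq. (7): sum over rows of (a_i^T x - b_i)^2 / (x^T S_i x + sigma_bi^2)."""
    total = 0.0
    for a_i, b_i, s_i, vb_i in zip(A, b, var_A, var_b):
        r = sum(a * xj for a, xj in zip(a_i, x)) - b_i
        d = sum(s * xj * xj for s, xj in zip(s_i, x)) + vb_i
        total += r * r / d
    return total

def wfn_full(x, A, b, var_A, var_b):
    """Eq. (4) evaluated at the row corrections Delta a_i given by Eq. (6)."""
    total = 0.0
    for a_i, b_i, s_i, vb_i in zip(A, b, var_A, var_b):
        r = sum(a * xj for a, xj in zip(a_i, x)) - b_i
        d = sum(s * xj * xj for s, xj in zip(s_i, x)) + vb_i
        da = [r * s * xj / d for s, xj in zip(s_i, x)]     # Eq. (6)
        quad = sum(v * v / s for v, s in zip(da, s_i))      # da^T S_i^-1 da
        db = sum(xj * v for xj, v in zip(x, da)) - r        # x^T da - x^T a_i + b_i
        total += quad + db * db / vb_i
    return total

# Hypothetical 3 x 2 example (not taken from the paper).
A = [[1.0, 2.0], [3.0, 1.0], [2.0, 2.0]]
b = [5.0, 5.0, 6.5]
var_A = [[0.04, 0.01], [0.01, 0.09], [0.25, 0.04]]   # sigma_ij^2
var_b = [0.01, 0.04, 0.02]                           # sigma_bi^2
x = [1.0, 2.2]

print(wfn_reduced(x, A, b, var_A, var_b), wfn_full(x, A, b, var_A, var_b))
```

The two printed values should agree to rounding error for any trial $x$, which is the content of Eq. (7).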

Since the problem in Eq. (8) cannot, in general, be solved in closed
form, an iterative (homotopic) algorithm is designed taking into account
the peculiarities of the objective function. To obtain a fast convergent
algorithm, we assume $F(x)$ to be strictly convex in the neighborhood $D^* \subset \mathbb{R}^n$
of a given point $x^*$ and approximate it by the 2nd-order truncated
Taylor formula $\tilde{F}(x^* + \Delta x) \approx F(x)$, with $\Delta x = x - x^*$, $\forall\, x \in D^*$. Under the
convexity assumption, a stationary point of $F(x)$ (defined by the nullity of
the gradient) is a local minimum too. This assumption implies the positive
definiteness of the Hessian matrix.

Since $F(x)$ in Eq. (7) is not, in general, convex $\forall\, x \in \mathbb{R}^n$, the choice of
the starting point $x^*$ is crucial to obtain an efficient algorithm. However,
there exists a convexity region $R_x \subset \mathbb{R}^n$, including the minimizer $\hat{x}$, over
which $F(x)$ is strictly convex. Unfortunately, the boundary of $R_x$ is not
easy to determine.

To overcome this drawback, we introduce a continuation parameter $\eta$
and expand $F(x)$ in Eq. (7) into a parametric family $F(x, \eta)$, called Parametric
Weighted Frobenius Norm (PWFN)

$$F(x, \eta) = \sum_{i \in \mathcal{I}} \frac{\left( a_i^T x - b_i \right)^2}{\eta\, x^T S_i x + \sigma_{b_i}^2}, \quad \text{with } \eta \in [0, 1]. \qquad (9)$$

Since both the minimizer $\hat{x}$ and the convexity region $R_x$ vary with respect
to $\eta$, in the sequel we will denote them by $\hat{x}(\eta)$ and $R_x(\eta)$, respectively.
The curve described by $\hat{x}(\eta)$ in $\mathbb{R}^n$, by varying $\eta$ in $[0, 1]$, is called minima
locus, while the region described by $R_x(\eta)$ forms the convexity funnel in
the $\mathbb{R}^{n+1}$ space spanned by $x$ and $\eta$, including the minima locus.

By examining $F(x, \eta)$ in Eq. (9), we observe that $F(x, 0)$ corresponds to
the WFN of the OWLS problem derived from the original EW-TLS problem
by assuming that all the entries of $A$ are error-free (i.e., $\sigma_{ij} = 0$, $\forall\,(i,j) \in \mathcal{I} \times \mathcal{J}$).
In this case, $\hat{x}(0)$ can be determined in closed form and $R_x(0)$
coincides with $\mathbb{R}^n$, because $F(x, 0)$ reduces to a quadratic form in $x$.

By varying $\eta$ from 0 to 1, $F(x, \eta)$ represents the WFN of EW-TLS
problems in which the variances of the entries of $A$ are proportional to those
of the original EW-TLS problem in Eq. (2), i.e., $\eta\, \sigma_{ij}^2$, $\forall\,(i,j) \in \mathcal{I} \times \mathcal{J}$. For
$\eta = 1$, $F(x, 1)$ coincides with $F(x)$ in Eq. (7).

Moreover, the convexity region $R_x(\eta)$ contracts from $R_x(0) = \mathbb{R}^n$ to a finite
region in $R_x(1)$. If $\hat{x}(\eta^*) \in R_x(\eta^*)$, for the sake of analytical continuity,
we have $\hat{x}(\eta^* + \Delta\eta) \in R_x(\eta^*)$ for $\Delta\eta$ small enough.

Note that, while the points on the surface delimiting the funnel are
characterized by the positive semidefiniteness of the Hessian, the points
within the funnel are characterized by the positive definiteness of the Hessian,
and the Hessian matrix can be decomposed into the product $U^T U$,
where $U$ is an upper triangular matrix. The possible non-existence of the
factorization $U^T U$ implies that the matrix is neither positive definite nor
positive semidefinite; thus the current solution $x$ is external to the funnel.
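The factorization test described above can be sketched as follows. This is a minimal illustration with hypothetical names, not the authors' code: the routine attempts the $U^T U$ factorization and reports failure as soon as a pivot is not strictly positive. Note that with this strict test a singular positive semidefinite matrix (a point on the funnel boundary) is also reported as a failure.

```python
import math

def cholesky_upper(H):
    """Return upper-triangular U with U^T U = H, or None if a pivot is not positive."""
    n = len(H)
    U = [[0.0] * n for _ in range(n)]
    for i in range(n):
        pivot = H[i][i] - sum(U[k][i] ** 2 for k in range(i))
        if pivot <= 0.0:
            return None                      # factorization does not exist
        U[i][i] = math.sqrt(pivot)
        for j in range(i + 1, n):
            U[i][j] = (H[i][j] - sum(U[k][i] * U[k][j] for k in range(i))) / U[i][i]
    return U

# Hypothetical Hessians: the first is positive definite, the second is indefinite.
inside = cholesky_upper([[4.0, 2.0], [2.0, 3.0]])
outside = cholesky_upper([[1.0, 2.0], [2.0, 1.0]])
print(inside is not None, outside is not None)
```

A point whose Hessian passes the test is treated as lying inside the funnel; a failed factorization is the signal used to reject the current point.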


3. The convexity funnel algorithm 


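In this section we derive the Newton algorithm to minimize the PWFN
$F(x, \eta)$, for a given $\eta$. To this aim we introduce the following notations:
$\nabla_x F(x, \eta)$ and $\nabla_x^T F(x, \eta)$ are the column and row gradient vectors
of $F(x, \eta)$ versus $x$, while $\nabla_x \nabla_x^T F(x, \eta)$ is the Hessian matrix of $F(x, \eta)$
versus $x$. Note that in these definitions the continuation parameter $\eta$ is
kept invariant. The second-order truncated Taylor formula $\tilde{F}(x^* + \Delta x, \eta)$
approximating $F(x, \eta)$ in the neighborhood of a generic point $x^*$ is given
by

$$\tilde{F}(x^* + \Delta x, \eta) = F(x^*, \eta) + \nabla_x^T F(x^*, \eta)\, \Delta x + \tfrac{1}{2}\, \Delta x^T\, \nabla_x \nabla_x^T F(x^*, \eta)\, \Delta x, \qquad (10)$$

where

$$\nabla_x^T F(x^*, \eta) = \sum_{i=1}^{m} \frac{2\, (a_i^T x^* - b_i)}{\eta\, x^{*T} S_i x^* + \sigma_{b_i}^2}\, a_i^T - \sum_{i=1}^{m} \frac{2 \eta\, (a_i^T x^* - b_i)^2}{\left( \eta\, x^{*T} S_i x^* + \sigma_{b_i}^2 \right)^2}\, x^{*T} S_i \qquad (11)$$

and

$$\begin{aligned}
\nabla_x \nabla_x^T F(x^*, \eta) = {} & \sum_{i=1}^{m} \frac{2}{\eta\, x^{*T} S_i x^* + \sigma_{b_i}^2}\, a_i a_i^T - \sum_{i=1}^{m} \frac{2 \eta\, (a_i^T x^* - b_i)^2}{\left( \eta\, x^{*T} S_i x^* + \sigma_{b_i}^2 \right)^2}\, S_i \\
& - \sum_{i=1}^{m} \frac{4 \eta\, (a_i^T x^* - b_i)}{\left( \eta\, x^{*T} S_i x^* + \sigma_{b_i}^2 \right)^2} \left[ S_i x^* a_i^T + a_i x^{*T} S_i \right] \\
& + \sum_{i=1}^{m} \frac{8 \eta^2\, (a_i^T x^* - b_i)^2}{\left( \eta\, x^{*T} S_i x^* + \sigma_{b_i}^2 \right)^3}\, S_i x^* x^{*T} S_i. \qquad (12)
\end{aligned}$$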
The second-order Taylor formula in Eq. (10) is exploited to realize a pure
(quadratically convergent) iterative Newton algorithm. If $x^*$ is a point close
enough to the searched minimum, each single Newton substep consists in
determining an increment $\Delta x = x - x^*$ under the assumption that the
Taylor expansion is good enough and that the objective function is strictly
convex in the neighborhood of $x^*$. The gradient of the Taylor expansion in
Eq. (10) is

$$\nabla_x \tilde{F}(x^* + \Delta x, \eta) = \nabla_x F(x^*, \eta) + \nabla_x \nabla_x^T F(x^*, \eta)\, \Delta x. \qquad (13)$$

Each stationary point is then given by

$$\Delta x = -\left[ \nabla_x \nabla_x^T F(x^*, \eta) \right]^{-1} \nabla_x F(x^*, \eta). \qquad (14)$$
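The gradient and Hessian expressions of Eqs. (11) and (12) are easy to mistype, so a finite-difference comparison is a useful safeguard. The Python sketch below is a minimal illustration under stated assumptions (hypothetical function names and data, strictly positive $\sigma_{b_i}^2$); it is not the authors' implementation.

```python
def pwfn(x, A, b, var_A, var_b, eta):
    """PWFN of Eq. (9)."""
    total = 0.0
    for a, bi, s, vb in zip(A, b, var_A, var_b):
        r = sum(aj * xj for aj, xj in zip(a, x)) - bi
        d = eta * sum(sj * xj * xj for sj, xj in zip(s, x)) + vb
        total += r * r / d
    return total

def pwfn_grad(x, A, b, var_A, var_b, eta):
    """Column gradient of the PWFN, i.e. the transpose of Eq. (11)."""
    g = [0.0] * len(x)
    for a, bi, s, vb in zip(A, b, var_A, var_b):
        r = sum(aj * xj for aj, xj in zip(a, x)) - bi
        d = eta * sum(sj * xj * xj for sj, xj in zip(s, x)) + vb
        for j in range(len(x)):
            g[j] += 2 * r * a[j] / d - 2 * eta * r * r * s[j] * x[j] / d**2
    return g

def pwfn_hess(x, A, b, var_A, var_b, eta):
    """Hessian of the PWFN, term by term as in Eq. (12)."""
    n = len(x)
    H = [[0.0] * n for _ in range(n)]
    for a, bi, s, vb in zip(A, b, var_A, var_b):
        r = sum(aj * xj for aj, xj in zip(a, x)) - bi
        sx = [sj * xj for sj, xj in zip(s, x)]               # S_i x
        d = eta * sum(v * xj for v, xj in zip(sx, x)) + vb
        for j in range(n):
            H[j][j] -= 2 * eta * r * r * s[j] / d**2         # diagonal S_i term
            for k in range(n):
                H[j][k] += (2 * a[j] * a[k] / d
                            - 4 * eta * r * (a[j] * sx[k] + sx[j] * a[k]) / d**2
                            + 8 * eta**2 * r * r * sx[j] * sx[k] / d**3)
    return H

# Hypothetical 3 x 2 example (not taken from the paper).
A = [[1.0, 2.0], [3.0, 1.0], [2.0, 2.0]]
b = [5.0, 5.0, 6.5]
var_A = [[0.04, 0.01], [0.01, 0.09], [0.25, 0.04]]
var_b = [0.01, 0.04, 0.02]
x = [1.0, 2.2]
print(pwfn_grad(x, A, b, var_A, var_b, 0.5))
print(pwfn_hess(x, A, b, var_A, var_b, 0.5))
```

With these two functions, the Newton increment of Eq. (14) is the solution of the linear system whose matrix is the Hessian and whose right-hand side is the negative gradient.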


If $F(x, \eta)$ turns out to be convex at $x = x^* + \Delta x$, we set $x^* = x^* + \Delta x$ and
execute a further Newton substep. Otherwise, we stop the Newton algorithm
and restart it, by choosing a lower value for $\eta$. This trick is fundamental
in order to maintain $x^*$ within the convexity funnel. More precisely, let $\eta^*$
be the highest value of $\eta$ for which we know the corresponding minimum
$x^{**} = \hat{x}(\eta^*)$ and $\eta_a > \eta^*$ be the value of $\eta$ for which the Newton algorithm,
starting from $x^{**}$, does not converge. We repeat the Newton algorithm,
starting again from $x^{**}$, but choosing a lower value $\eta_b \in (\eta^*, \eta_a)$. This
decrease of $\eta$ makes the convergence of the Newton algorithm easier because
the searched minimum $\hat{x}(\eta_b)$ is closer to $x^{**}$ than the previous $\hat{x}(\eta_a)$. If the
Newton algorithm does not converge for $\eta = \eta_b$ either, $\eta_b$ is further decreased
until a value $\eta_b > \eta^*$ assuring the convergence is obtained. The success of
this homotopic procedure is assured by the analytical continuity of $\hat{x}(\eta)$,
due to the analytical continuity of $F(x, \eta)$ for $\forall\, x$ near to its minima. So,
we can set $\eta^* = \eta_b$ and $\eta_a = \eta^* + \Delta\eta$ and repeat the procedure with a
higher value of $\eta^*$.
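The restart logic described above can be sketched in a few dozen lines of Python. This is a simplified illustration, not the authors' implementation: the function names are hypothetical, $A$ is assumed to have full column rank and all $\sigma_{b_i}^2$ to be positive, the starting point is simply the last accepted minimum (no extrapolation), and the step rules (divide $\Delta\eta$ by 5 after a failure, multiply it by 3 after a success) are assumptions chosen for the sketch.

```python
import math

def grad_hess(x, A, b, var_A, var_b, eta):
    """Gradient (Eq. 11) and Hessian (Eq. 12) of the PWFN F(x, eta)."""
    n = len(x)
    g = [0.0] * n
    H = [[0.0] * n for _ in range(n)]
    for a, bi, s, vb in zip(A, b, var_A, var_b):
        r = sum(aj * xj for aj, xj in zip(a, x)) - bi        # a_i^T x - b_i
        sx = [sj * xj for sj, xj in zip(s, x)]               # S_i x
        d = eta * sum(v * xj for v, xj in zip(sx, x)) + vb   # eta x^T S_i x + sigma_bi^2
        for j in range(n):
            g[j] += 2 * r * a[j] / d - 2 * eta * r * r * sx[j] / d**2
            H[j][j] -= 2 * eta * r * r * s[j] / d**2
            for k in range(n):
                H[j][k] += (2 * a[j] * a[k] / d
                            - 4 * eta * r * (a[j] * sx[k] + sx[j] * a[k]) / d**2
                            + 8 * eta**2 * r * r * sx[j] * sx[k] / d**3)
    return g, H

def cholesky_solve(H, rhs):
    """Solve H y = rhs through H = U^T U; return None if H is not positive definite."""
    n = len(H)
    U = [[0.0] * n for _ in range(n)]
    for i in range(n):
        pivot = H[i][i] - sum(U[k][i] ** 2 for k in range(i))
        if pivot <= 0.0:
            return None
        U[i][i] = math.sqrt(pivot)
        for j in range(i + 1, n):
            U[i][j] = (H[i][j] - sum(U[k][i] * U[k][j] for k in range(i))) / U[i][i]
    z = [0.0] * n                                  # forward substitution: U^T z = rhs
    for i in range(n):
        z[i] = (rhs[i] - sum(U[k][i] * z[k] for k in range(i))) / U[i][i]
    y = [0.0] * n                                  # back substitution: U y = z
    for i in reversed(range(n)):
        y[i] = (z[i] - sum(U[i][k] * y[k] for k in range(i + 1, n))) / U[i][i]
    return y

def norm(v):
    return math.sqrt(sum(c * c for c in v))

def newton(x0, A, b, var_A, var_b, eta, eps, max_steps=50):
    """Newton substeps at fixed eta (Eq. 14); None signals an exit from the funnel."""
    x = list(x0)
    for _ in range(max_steps):
        g, H = grad_hess(x, A, b, var_A, var_b, eta)
        dx = cholesky_solve(H, [-c for c in g])
        if dx is None:                             # no U^T U factorization at x
            return None
        if norm(dx) < eps * norm(x):               # stationary, Hessian positive definite
            return x
        x = [xj + dj for xj, dj in zip(x, dx)]
    return None

def convexity_funnel(A, b, var_A, var_b, d_eta, eps0=0.05, eps1=1e-10):
    n = len(A[0])
    # eta = 0: F is quadratic, so one Newton step from the origin is the OWLS minimizer.
    g, H = grad_hess([0.0] * n, A, b, var_A, var_b, 0.0)
    x_star, eta_star = cholesky_solve(H, [-c for c in g]), 0.0
    while eta_star < 1.0:
        if d_eta < 1e-15:
            raise RuntimeError("continuation step became too small")
        eta = min(1.0, eta_star + d_eta)
        x_new = newton(x_star, A, b, var_A, var_b, eta, eps1 if eta == 1.0 else eps0)
        if x_new is None:
            d_eta /= 5.0                           # restart from x_star with a lower eta
        else:
            x_star, eta_star = x_new, eta
            d_eta *= 3.0                           # assumed growth rule after a success
    return x_star

# Hypothetical 4 x 2 example with small errors (not taken from the paper).
A = [[1.0, 2.0], [3.0, 1.0], [2.0, 2.0], [1.0, 4.0]]
b = [5.02, 4.97, 6.01, 9.03]
var_A = [[0.01, 0.04], [0.04, 0.01], [0.01, 0.01], [0.04, 0.04]]
var_b = [0.01, 0.01, 0.01, 0.01]
x_hat = convexity_funnel(A, b, var_A, var_b, d_eta=0.25)
print(x_hat)
```

When all variances coincide, the PWFN at $\eta = 1$ is the ordinary TLS objective, so the result can be compared with a TLS solution obtained independently; for a single unknown that solution has a closed form, which makes a convenient check.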

The choice of $\Delta\eta$ is crucial: for $\Delta\eta$ small enough, the convergence of the
Newton algorithm is assured, but many increments of $\eta$ can be necessary
to reach the final value $\eta = 1$. On the other hand, for $\Delta\eta$ large enough,
few increments of $\eta$ can be sufficient in principle, but the Newton algorithm
may fail to converge. To realize a good trade-off, we choose the starting
point of a Newton algorithm for a given $\eta$ according to some extrapolation
techniques.

If we know a set of minimizers $\hat{x}(\eta_\nu), \hat{x}(\eta_{\nu+1}), \ldots, \hat{x}(\eta_p)$ corresponding
to the set of values $0 < \eta_\nu < \eta_{\nu+1} < \ldots < \eta_p$, we can estimate the
value of $\hat{x}(\eta_{p+1})$ corresponding to an assigned value $\eta_{p+1} > \eta_p$ by extrapolating
the known values $\hat{x}(\eta_\nu), \hat{x}(\eta_{\nu+1}), \ldots, \hat{x}(\eta_p)$. The extrapolation is
obtained by adopting $n$ simple functions, one for each entry of $x$. These
extrapolation formulas allow one to choose the starting point for determining
the minimizer $\hat{x}(\eta_{p+1})$ so accurately that the Newton method supplies
the true value in a few iterations.
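The text does not specify which "simple functions" are used for the extrapolation. As one possible choice, the sketch below extrapolates each entry of $x$ linearly through the two most recent minimizers; the function name and the sample minima locus are hypothetical.

```python
def extrapolate_start(etas, minimizers, eta_next):
    """Linearly extrapolate each entry of x from the two most recent minimizers."""
    if len(etas) < 2:
        return list(minimizers[-1])          # nothing to extrapolate from yet
    e0, e1 = etas[-2], etas[-1]
    x0, x1 = minimizers[-2], minimizers[-1]
    t = (eta_next - e1) / (e1 - e0)
    return [v1 + t * (v1 - v0) for v0, v1 in zip(x0, x1)]

# Hypothetical minima locus that happens to be linear in eta.
etas = [0.0, 0.1, 0.2]
minimizers = [[1.0, 3.0], [1.2, 2.9], [1.4, 2.8]]
start = extrapolate_start(etas, minimizers, 0.5)
print(start)
```

Higher-order fits through more of the known minimizers would follow the same pattern, one fitted function per entry of $x$.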

The initial value of $\eta$ can be chosen on the basis of experience. In the
next section we determine it by an empirical formula. However, this value 
is not at all critical because it is varied by the algorithm itself when the 
Newton algorithm does not converge. 

The convergence of the Newton algorithm is controlled by the distance
$\|\Delta x\|$: for $\|\Delta x\| < \epsilon\, \|x^*\|$ we ascertain that $x^*$ is an equilibrium point
of $F(x, \eta)$, and the positive definiteness of the Hessian matrix $\nabla_x \nabla_x^T F(x^*, \eta) = U^T U$
assures that $x^*$ is inside the convexity funnel and, hence, that it is a
minimum of $F(x, \eta)$. In the algorithm we introduce the desired relative
accuracy $\epsilon_0$ of partial minima (i.e., for $\eta < 1$), and the accuracy $\epsilon_1 < \epsilon_0$ of
the final minimum (i.e., for $\eta = 1$).

If the distance $\|\Delta x^*\|$ between the minima related to the $k_\eta$-th value
of $\eta$ and the $(k_\eta - 1)$-th one is small enough, the next value of $\eta$ is set to
1. Otherwise, some strategy has to be determined in order to improve the
speed of the algorithm. Let $\epsilon_x$ be the accuracy of the distance to distinguish
between two successive minima, $k_\eta$ the counter of the increments of $\eta$ and
$k_x$ the counter of the substeps within a single Newton algorithm.

The main computational effort is required in calculating and factorizing
the Hessian matrix. Several multiplications and additions of matrices
and vectors are involved. However, the matrices in products are rank-one
(matrices $a_i a_i^T$, $x x^T$, $a_i x^T$, $x a_i^T$ in Eq. (12)) or diagonal (matrices
$S_1, S_2, \ldots, S_m$ in Eq. (12)), due to the structure of the objective function
$F(x, \eta)$ in Eq. (9). Generic non-sparse matrices are involved only in less
expensive summations over $i$. This fact makes the calculation of the Hessian
less CPU expensive than that of a generic $\mathbb{R}^n \to \mathbb{R}$ function. The LU
factorization of the Hessian matrix is realized by the standard Cholesky
algorithm.


4. Numerical results 


All examples concern a specific noisy matrix $A$ and vector
$b$ derived from a unique error-free 10 x 2 matrix and error-free vector, by
adding random errors to all entries. The initial value of $\Delta\eta$ starting the
algorithm is determined by the empirical formula

$$\Delta\eta = \min_{(i,j) \in \mathcal{I} \times \mathcal{J}} \frac{\sigma_{b_i}^2}{\sigma_{ij}^2}.$$

In all examples the accuracy required for stopping the intermediate iterative
Newton algorithm is $\epsilon_0 = 0.05$, while the accuracy for stopping the overall
algorithm is $\epsilon_1 = 10^{-10}$; $\epsilon_x = 0.05$ is the accuracy for the distance between
minima related to two successive increments of $\eta$.


The examples are characterized by the same standard deviations of the
respective errors

$$[\sigma_{ij}] = \begin{bmatrix}
4.0000 & 0.1000 \\
3.5667 & 0.5333 \\
3.1333 & 0.9667 \\
2.7000 & 1.4000 \\
2.2667 & 1.8333 \\
1.8333 & 2.2667 \\
1.4000 & 2.7000 \\
0.9667 & 3.1333 \\
0.5333 & 3.5667 \\
0.1000 & 4.0000
\end{bmatrix}
\quad \text{and} \quad
[\sigma_{b_i}] = \begin{bmatrix}
0.0100 \\ 0.0100 \\ 0.0100 \\ 0.0100 \\ 0.0100 \\ 0.0100 \\ 0.0100 \\ 0.0100 \\ 0.0100 \\ 0.0100
\end{bmatrix},$$

whereas data matrix $A$ and data vector $b$ are different. This induces the
same starting value $\Delta\eta = 0.00000625$ but the behaviour of the algorithm
turns out to be rather different.

For

$$A = \begin{bmatrix}
2.59682756412627 & 9.14729111791493 \\
6.82215678878727 & 9.19327278015292 \\
3.46230227137220 & 2.74926114730709 \\
3.20353427391860 & 1.66816071164268 \\
10.89991665143950 & -1.61023973136850 \\
-0.37203854346990 & 6.13279826630644 \\
4.65527261492859 & 4.55510312153970 \\
0.65763908669889 & 10.76259377967540 \\
3.06849059794170 & 7.85144473997031 \\
2.97487063693829 & 13.19952354957530
\end{bmatrix},
\quad
b = \begin{bmatrix}
67.008389261 \\ 63.003350575 \\ 26.995457326 \\ 2.997805620 \\ 9.001552853 \\ 42.990951419 \\ 51.989789658 \\ 56.995640572 \\ 46.013382142 \\ 24.010696135
\end{bmatrix}$$

the algorithm converges in the predicted way as shown in Table 1.
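The starting value $\Delta\eta = 0.00000625$ quoted above can be checked against the listed standard deviations. The sketch below assumes that the empirical formula is the smallest ratio $\sigma_{b_i}^2 / \sigma_{ij}^2$ over all entries, which is the reading consistent with the quoted value; the function name is hypothetical.

```python
# Standard deviations of the errors on A (10 x 2) and on b, as listed in this section.
sigma_A = [[4.0000, 0.1000], [3.5667, 0.5333], [3.1333, 0.9667], [2.7000, 1.4000],
           [2.2667, 1.8333], [1.8333, 2.2667], [1.4000, 2.7000], [0.9667, 3.1333],
           [0.5333, 3.5667], [0.1000, 4.0000]]
sigma_b = [0.0100] * 10

def initial_delta_eta(sigma_A, sigma_b):
    """Smallest ratio sigma_bi^2 / sigma_ij^2 over all entries with sigma_ij > 0."""
    return min((sb / s) ** 2 for row, sb in zip(sigma_A, sigma_b) for s in row if s > 0)

delta_eta = initial_delta_eta(sigma_A, sigma_b)
print(f"{delta_eta:.8f}")   # 0.00000625
```

The minimum is attained where $\sigma_{ij} = 4.0$, giving $(0.01 / 4.0)^2 = 6.25 \times 10^{-6}$.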


Table 1 shows the numerical results illustrating the behavior and performance
of the algorithm. In particular, Table 1 shows the starting point
$x$ and the obtained point $x^*$ for each Newton substep. The last column
reports the squared modulus of the difference between the minima related
to two successive increments of $\eta$.


TABLE 1

η            x_1            x_2            ‖Δx*‖
OWLS         2.108078099    4.539951373    -
0.00000625   1.772811024    5.878032581
             1.621103047    6.261642009
             1.659813180    6.233089613    0.124E+01
0.00002498   1.771939925    6.193504350    -
0.00008120   1.806730504    6.183439118
             1.806938644    6.183581201    0.247E-01
1.00000000   1.822081254    6.177798235
             1.822130963    6.177825737    -
             1.822130968    6.177825736    -
             1.822130968    6.177825736    0.115E-01


The initial value of $\Delta\eta$ is small enough and four increments of $\eta$ are
required. On the contrary, for

$$A = \begin{bmatrix}
4.67143070625183 & 8.90028320737129 \\
12.49822068903150 & 8.27703510323791 \\
6.93862923994614 & 3.61443195076070 \\
3.40768477129794 & 0.14275866741392 \\
8.80844351480653 & 1.01201437135240 \\
3.73390581806528 & 2.50419998991749 \\
2.27239857108500 & 5.85511824073525 \\
1.30501094112949 & 4.87872627074283 \\
4.68577780114022 & 3.45851362432531 \\
2.85242249761820 & -3.69955579089445
\end{bmatrix},
\quad
b = \begin{bmatrix}
67.002621315 \\ 62.997016898 \\ 27.006431377 \\ 2.982733880 \\ 8.991048680 \\ 43.008023017 \\ 52.004382496 \\ 56.987487133 \\ 46.007503142 \\ 24.026246284
\end{bmatrix}$$

the initial value of $\Delta\eta$ turns out to be too large. In fact, not only for $\eta = 0.00000625$
but also for $\eta = 0.00000125$ the convergence is not reached
because $x$ exits the funnel, as shown in Table 2. However, the convergence
of the Newton substeps is obtained for $\eta = 0.00000025$. Thereafter, seven
increments of $\eta$ are required to reach the final minimum.


TABLE 2

Step  (k_η, k_x)  η            x_1             x_2            ‖Δx*‖
      OWLS                       1.769743953    6.378399634   -
2     (1,1)       0.00000625   -28.572823559   23.451717413
                               Newton algorithm stopped
      (1,1)       0.00000125    -1.626309116    8.889327409   -
                                 5.577592673    3.277287639   -
      (1,3)                    Newton algorithm stopped
      (1,1)       0.00000025    -0.126649107    8.397578230
      (1,2)                     -0.062206462    8.450164555
9     (1,3)                     -0.062144879    8.450585216   0.196E+01
      (2,1)       0.00000100    -0.242721491    8.257340967
11    (2,2)                     -0.227987309    8.251298262
12    (3,1)       0.00000325    -0.015263290    7.784231228   -
13    (3,2)                     -0.112771278    7.890215060   -
14    (3,3)                     -0.118013663    7.895949231
15    (4,1)       0.00000999     0.134357432    7.532394610   -
16                               0.002995922    7.654061812
17                              -0.002644704    7.659147132   0.186E+00
                  0.00003023     0.091304845    7.526564659   -
                  0.00009094     0.080115942    7.511082640   -
      (6,2)                      [illegible]    7.510709438   0.320E-01
22                1.00000000     0.083490152    7.494310985
                                 0.092464231    7.489498343   -
                                 0.092426145    7.489521438   -
                                 0.092426145    7.489521437   -
                                 0.092426145    7.489521437   0.173E-01


5. Conclusions

The proposed convexity funnel algorithm, implementing a homotopic
strategy, iteratively solves general Total-Least-Squares problems, in
which errors are assumed to be elementwise independently distributed but
with different variances (the so-called Elementwise-Weighted Total-Least-Squares
problems). It is based on the introduction of a parametric weighted
Frobenius norm of the errors. The crucial choice of the starting point for
the minimization of this norm is overcome by gradually increasing the
continuation parameter appearing in the norm. All examples indicate that the
algorithm is locally convergent and that, in the special case of the TLS setup, the
EW-TLS estimator coincides with the corresponding TLS estimator.

In this paper, no rigorous proof assures that the solution supplied by
the funnel algorithm is the global minimum of the weighted Frobenius norm
of errors. However, the confinement of the current point within the funnel,
during the whole procedure, suggests the coincidence of the solution with
the global minimum.


Some tests on realistic problems show the capabilities of the algorithm.


Measurement comparison data sets are generally summarized using a simple
Statistical reference value calculated from the pool of the participants’ results. 
This reference value can become the standard against which the performance 
of the participating laboratories is judged. Consideration of the comparison 
data sets, particularly with regard to the consequences and implications of such 
data pooling, can allow informed decisions regarding the appropriateness of 
choosing a simple statistical reference value. Recent key comparison results 
drawn from the BIPM database are examined to illustrate the nature of the 
problem, and the utility of a simple approach to creating pooled data 
distributions. We show how to use detailed analysis when arguing in favor of a 
KCRV, or when deciding that a KCRV is not always warranted for the 
particular data sets obtained experimentally. 


1. Introduction 


Among the many tasks facing international metrology today is the conduct and
analysis of key comparisons. The goal is to quantify the degrees of equivalence
among realizations of the system of units and the primary measurement
techniques in each field.^a These comparisons are summarized in published tables
that relate each participating laboratory result to a Key Comparison Reference
Value (KCRV) that is, in general, an aggregate statistical estimator such as the
mean (weighted uniformly or otherwise) or the median of the quoted laboratory
results (both the values and the standard deviations).^b

We present a simple technique for displaying and interpreting the
population of results, and for evaluating different statistical approaches to
determining when and when not to use a KCRV that is calculated from the
participants’ data sets. This work amplifies the data pooling approach discussed
in [1], and is similar to the mixture distributions explored in [2].


^a Available at http://www.bipm.org/pdf/mra.pdf
^b Available at http://kcdb.bipm.org


2. Data Pooling and Reference Values


In many statistical approaches to data analysis, the initial assumption is to 
consider that all of the participants’ data represent individual samples from a 
single population. A coherent picture of the population mean and standard 
deviation can be built from the comparison data set that is fully consistent with 
the reported values and uncertainties. Most outlier-test protocols rely on this 
assumption to identify when and if a given laboratory result should be excluded, 
since its inclusion would violate this internal consistency. 

Creating pooled data distributions tackles this problem from the opposite 
direction: rather than assuming a single (normal) distribution for the population, 
the independent distributions reported by each participant are summed directly, 
and the result is taken as representative of the underlying population of possible 
measurements reported by randomized participants. 
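A pooled distribution of this kind can be sketched in a few lines. The example below uses the five results listed in Table 1 of the worked example later in this paper and, as an assumption made only for this sketch, treats every participant's distribution as normal and weights all participants equally; the function names are hypothetical.

```python
import math

# Value and standard uncertainty in mW, from Table 1 of the worked example.
labs = {"PTB": (97.4, 0.84), "NIST": (99.0, 0.64), "NPL": (97.6, 1.01),
        "CSIRO": (114.5, 6.75), "NIM": (94.0, 1.16)}

def normal_pdf(x, mean, u):
    z = (x - mean) / u
    return math.exp(-0.5 * z * z) / (u * math.sqrt(2.0 * math.pi))

def pooled_pdf(x, results):
    """Equal-weight sum of the participants' PDFs, normalized to unit area."""
    return sum(normal_pdf(x, m, u) for m, u in results) / len(results)

step = 0.01
grid = [50.0 + k * step for k in range(12001)]           # 50 mW to 170 mW
density = [pooled_pdf(x, labs.values()) for x in grid]

# Trapezoidal estimates of the area and of the mean of the pooled distribution.
area = step * (sum(density) - 0.5 * (density[0] + density[-1]))
moment = [x * p for x, p in zip(grid, density)]
pooled_mean = step * (sum(moment) - 0.5 * (moment[0] + moment[-1]))
print(round(area, 6), round(pooled_mean, 3))
```

By construction, the mean of the pooled distribution equals the simple mean of the reported values.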

When all participants’ uncertainties are normal, it is straightforward to 
combine them and obtain expressions for the distributions of the population and 
for the simple and variance-weighted means. If finite degrees of freedom are 
provided, it can be more complex to combine the appropriate Student 
distributions analytically. The Welch-Satterthwaite approximation is often used 
to determine the effective degrees of freedom for a combination, for example. 
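For reference, the Welch-Satterthwaite formula for the effective degrees of freedom of a sum of independent uncertainty components is $\nu_{\mathrm{eff}} = \left(\sum_i u_i^2\right)^2 / \sum_i (u_i^4 / \nu_i)$. A minimal sketch, with a hypothetical function name and unit sensitivity coefficients assumed:

```python
def welch_satterthwaite(u, nu):
    """Effective degrees of freedom for independent components u_i with nu_i degrees of freedom."""
    combined_variance = sum(ui ** 2 for ui in u)
    return combined_variance ** 2 / sum(ui ** 4 / ni for ui, ni in zip(u, nu))

# Two equal components with 4 degrees of freedom each combine to 8.
print(welch_satterthwaite([1.0, 1.0], [4, 4]))   # 8.0
```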

Rigorous analysis in closed form is not always feasible. The standard 
uncertainty of the median, for example, has no closed form. Instead, the median 
absolute deviation (MAD), scaled by a multiplicative factor derived from 
normal statistics, is often taken as a working estimate of the standard deviation. 
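The scaled MAD can be sketched as follows. The multiplicative factor used here is $1/\Phi^{-1}(0.75) \approx 1.4826$, the usual value that makes the MAD consistent with the standard deviation of a normal population. The sketch computes the plain (unweighted) scaled MAD of the five values in Table 1 of the worked example; it does not reproduce the weighted MAD function used in the comparison's final report, and the function name is hypothetical.

```python
import statistics

def scaled_mad(values):
    """Median absolute deviation, scaled to estimate a normal standard deviation."""
    med = statistics.median(values)
    mad = statistics.median(abs(v - med) for v in values)
    return mad / statistics.NormalDist().inv_cdf(0.75)

values_mW = [97.4, 99.0, 97.6, 114.5, 94.0]
print(statistics.median(values_mW), round(scaled_mad(values_mW), 3))
```

The single distant value (114.5 mW) has no influence on this estimate beyond its rank, which is the robustness property that motivates the MAD.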

In general, a numerical approach to data pooling that is easy to use and fast 
to compute can provide metrologists with a useful visual and statistical tool to 
investigate many aspects of the comparison population and its estimators of 
central tendency as revealed in the reported measurements and uncertainties. 


3. Monte Carlo Calculations 


One of the simplest and most versatile numerical methods for exploring these
distributions and related statistics is the Monte Carlo technique, which relies on
sampling to represent repeated fully randomized measurement comparisons.
Each Monte Carlo “roll of the dice” for a comparison event involves randomly
sampling every participant’s reported distribution independently. In the
language of bootstrap techniques, similar in implementation, this is “resampling
without replacement”. “Resampling with replacement” is inappropriate, since
events would result with some repeat participants and some left out altogether.


3.1. Random Number Generation


Most texts on numerical analysis [3, 4] provide fast algorithms for generating 
uniformly distributed pseudorandom numbers. These generators provide streams 
with long or very long periodicity (in the range 2” to 2°”), excellent uniformity 
and statistical properties, and can be seeded to allow repeated calculation using 
the same set of random numbers in different runs. 

Transformation from such uniformly distributed random numbers to
variates obeying any desired probability density function (PDF) can be effected
using the corresponding cumulative distribution function (CDF). In this
technique, the normalized CDF is computed for each participant from the
reported mean values, standard deviations, and degrees of freedom for the
appropriate PDF. Each uniformly distributed random variable may be
transformed using a simple lookup procedure: take the uniform random number
as the cumulative probability, and then the required variate may be read off of
the abscissa directly or computed using linear interpolation between the
bracketing pair of values. Figure 1 shows both the PDF and CDF for a
Student distribution with $\mu = -1$, $\sigma = 2$, and $\nu = 4$ as an aid to visualizing the
transformation algorithm.
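The lookup procedure can be sketched as follows for the Student distribution of Figure 1. This is an illustration under stated assumptions, not the Toolkit's code: $\sigma$ is treated as the scale parameter of the Student distribution, the CDF is tabulated by trapezoidal integration of the PDF on a uniform grid, and the grid extent and number of points (`half_width`, `points`) are choices made for this sketch rather than the values quoted in the text. All function names are hypothetical.

```python
import bisect
import math
import random

def student_pdf(x, mu, sigma, nu):
    """Location-scale Student PDF with scale sigma and nu degrees of freedom."""
    z = (x - mu) / sigma
    c = math.gamma((nu + 1) / 2) / (math.gamma(nu / 2) * math.sqrt(nu * math.pi) * sigma)
    return c * (1 + z * z / nu) ** (-(nu + 1) / 2)

def tabulate_cdf(mu, sigma, nu, half_width=50.0, points=20001):
    """Tabulate the normalized CDF on a uniform grid by trapezoidal integration."""
    lo = mu - half_width * sigma
    step = 2 * half_width * sigma / (points - 1)
    xs = [lo + k * step for k in range(points)]
    pdf = [student_pdf(x, mu, sigma, nu) for x in xs]
    cdf = [0.0]
    for k in range(1, points):
        cdf.append(cdf[-1] + 0.5 * (pdf[k - 1] + pdf[k]) * step)
    total = cdf[-1]
    return xs, [c / total for c in cdf]

def inverse_cdf(u, xs, cdf):
    """Map a cumulative probability u to a variate by linear interpolation."""
    k = bisect.bisect_left(cdf, u)           # first index with cdf[k] >= u
    if k == 0:
        return xs[0]
    if k >= len(xs):
        return xs[-1]
    t = (u - cdf[k - 1]) / (cdf[k] - cdf[k - 1])
    return xs[k - 1] + t * (xs[k] - xs[k - 1])

xs, cdf = tabulate_cdf(mu=-1.0, sigma=2.0, nu=4)
rng = random.Random(0)                       # seeded so that runs are repeatable
samples = [inverse_cdf(rng.random(), xs, cdf) for _ in range(20000)]
print(inverse_cdf(0.5, xs, cdf), inverse_cdf(0.975, xs, cdf))
```

The same table lookup works for any tabulated distribution, which is why the method extends beyond the Student case.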


[Figure 1 plot: Student PDF versus x, for x from -10 to 10.]


Figure 1. Plot of Student probability density function (dark line, left axis) and cumulative 
distribution function (light line, right axis). CDF(x) is uniformly distributed, and may be 
used to transform uniform random numbers onto the Student (or any other) distribution. 


We have found that calculating 10° points to describe each participant’s 
CDF out to 7 or so standard deviations from their mean gives enough numerical 
resolution for evaluating comparison data. Far fewer points are required in the 
distribution histograms: sufficiently detailed curves can be obtained using 1000
bins. In this way, smooth graphs for the pooled distribution and the distributions
of the simple mean, the variance weighted mean, the median, and many other
simple estimators of central location can be generated using $10^6$ or $10^7$ events.


3.2. Toolkit Implementation 


Our Excel Toolkit [5] has been extended to include macros that compute and 
display distributions.^c For a multi-laboratory comparison, measurements,
uncertainties, degrees of freedom and inter-laboratory correlation coefficients 
are entered on an Excel worksheet. Toolkit macros make it simple to plot the 
individual participants’ results, as well as the pooled distribution revealed in the 
comparison. Monte Carlo distributions of several common candidate reference 
values can also be calculated and plotted: the uniformly weighted (simple) 
mean, the inverse-variance weighted mean, and the median. 

The Visual Basic for Applications (VBA) macros can be called directly 
from Excel, but most computation takes place in an external Dynamically 
Linked Library (DLL). These math-intensive functions have been written in 
FORTRAN, validated using a separate code base independently written in C. 
Using the toolkit, it is easy to generate and display pooled-data and
reference-value distributions for a comparison with ten participants: ten million
Monte Carlo resamplings of the comparison take only a few minutes on an
ordinary PC.


4. A Worked Example 


4.1. Key Comparison Data Set 


As an example of the data pooling technique, consider the data in Table 1, 
extracted from Table 45 in the final report for the recent Key Comparison in 
Ultrasonic Vibration designated CCAUV.U-K1 in the BIPM database [6]. In 
this case, five laboratories performed an ultrasonic power measurement under 
reference conditions at 1.9 MHz. The laboratory results include their measured 
reference power (Pref), standard uncertainty (u), and degrees of freedom (v). 
The pilot has reported that no correlations between participants have been 
identified, so that no covariances need to be taken into account. Although the 
final report summarizes the degrees of equivalence normalized by the reference 
value, here we have reverted to the usual format using the measurand’s units. 


^c Available at http://inms-ienm.nrc-cnrc.gc.ca/qde

These data are plotted in Figure 2. The Pilot determined that the data fail a
consistency check that incorporates effective degrees of freedom, and therefore
the median was chosen as the KCRV over the variance-weighted mean.


Table 1. Key Comparison data for five participants in
CCAUV.U-K1: ultrasonic power at 1.9 MHz, low power.

Lab      P_ref (mW)   u (mW)
PTB         97.4       0.84
NIST        99.0       0.64
NPL         97.6       1.01
CSIRO      114.5       6.75
NIM         94.0       1.16


Figure 2. Plot of Table 1, with error bars shown at 95% level of confidence. The
reference value, taken to be the median (97.6 mW), is indicated by the dotted line.


4.2. Monte Carlo Results 


The pooled data distribution computed from $10^7$ Monte Carlo events is shown in
Figure 3, along with the distributions for the simple mean, the variance 
weighted mean, and the median. It is straightforward to extract the mean value 
and the 95% confidence interval for each of these distributions using the 
numerical results, as shown in Table 2. 
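The kind of calculation behind these summary statistics can be sketched as follows. This is a simplified illustration, not the Toolkit's implementation: each participant's distribution is taken as normal (the degrees of freedom are ignored), far fewer events are used than in the text, and the function names are hypothetical, so the numbers produced are not expected to match the published table exactly.

```python
import random
import statistics

# Lab, value (mW), standard uncertainty (mW), from Table 1.
labs = [("PTB", 97.4, 0.84), ("NIST", 99.0, 0.64), ("NPL", 97.6, 1.01),
        ("CSIRO", 114.5, 6.75), ("NIM", 94.0, 1.16)]

def simulate(labs, events, seed=12345):
    """Each event draws one value per participant and evaluates three estimators."""
    rng = random.Random(seed)
    weights = [1.0 / u ** 2 for _, _, u in labs]
    weight_sum = sum(weights)
    out = {"Mean": [], "WtMean": [], "Median": []}
    for _ in range(events):
        draw = [rng.gauss(x, u) for _, x, u in labs]
        out["Mean"].append(sum(draw) / len(draw))
        out["WtMean"].append(sum(w * v for w, v in zip(weights, draw)) / weight_sum)
        out["Median"].append(statistics.median(draw))
    return out

def summarize(samples):
    """Mean, standard deviation and central 95% interval of a simulated distribution."""
    s = sorted(samples)
    interval = (s[int(0.025 * len(s))], s[int(0.975 * len(s))])
    return statistics.fmean(s), statistics.stdev(s), interval

results = simulate(labs, events=20000)
for name, samples in results.items():
    mean, std, (lo, hi) = summarize(samples)
    print(f"{name:7s} {mean:8.2f} {std:6.2f}  [{lo:.2f}, {hi:.2f}]")
```

Any other estimator of central location can be added by appending one more line inside the event loop.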

Several insights into the comparison can be obtained from Figure 3. The
pooled data distribution is multi-modal, and does not justify assuming a normal
population for outlier testing. Furthermore, given the similar distributions, it is
unclear how choosing the median over the weighted mean as the KCRV has any
meaningful impact on the comparison summary. The standard deviation of the
median distribution is significantly smaller than the uncertainty calculated in the
final report using the weighted MAD function: 0.74 mW rather than 1.32 mW.


Table 2. Candidate reference values for the data in Table 1.

Estimator   Distribution Mean (mW)   Distribution Std Dev (mW)   95% Confidence Interval (mW)
Mean              100.50                     1.40                   [97.15, 103.80]
WtMean             97.75                     0.42                   [96.75, 98.70]
Median             97.60                     0.74


[Figure 3 plot: PDF curves, including those labeled Pooled Data and Weighted Mean.]

Figure 3. Plot of distributions derived from the data in Table 1. The pooled data is the
sum of the individual distributions. The distributions for the simple mean, variance
weighted mean, and median were calculated using Monte Carlo techniques. The KCRV,
taken to be the instance median for this comparison, is indicated by the vertical axis.


5. Advantages of the Monte Carlo Approach 


More important than its simplicity of use, the Monte Carlo technique offers full 
flexibility regarding methods to be studied. 

With Monte Carlo methods, the statistical properties of any algorithmic 
method may be computed by writing a subroutine that implements the 
algorithm. Any such statistic on each of the participating laboratories (or pairs, 
or subgroups) may be generated to summarize their place in the comparison. 
Beyond treating the combined uncertainties as normal or Student distributions, 
and the correlations as normal distributions, individual terms in participants’ 
uncertainty budgets, or the travel uncertainty of the artifact, may be assigned 
independent or shared random variates drawn from appropriate distributions.


5.1. Covariances


The covariance between a candidate reference value method and a participant 
laboratory can be found by Monte Carlo methods. After 10% of the events, the 
mean values are known, and the variance and covariance (away from the means) 
can be computed with the last 90% of the events. Similarly, the covariance of 
the weighted mean with the median might be calculated. 

Where correlations between laboratories have been identified, the random 
numbers must be used with slightly greater care. For each simulated 
comparison, each laboratory value will be composed of its own unique 
independent part, and shared correlated parts. The combined uncertainty’s 
Student distribution is decomposed into normally distributed random variables, 
each shared with one other Lab, and an independent Student-distributed random 
variable with a reduced degrees of freedom (such that the Welch-Satterthwaite 
approximation, applied to the combined uncertainty will return the appropriate 
Student distribution for the combined uncertainty). Some extra care is required 
to deal with negative correlation coefficients, adding the random variate to one 
Lab, and subtracting it from the other. 


5.2. Outlier Rejection Schemes 


In some comparisons, outlier rejection schemes are identified in the protocol; in 
others, this question is addressed during data analysis. Since the consequences 
for any participant laboratory of being identified as an outlier can be severe — 
including rejection of its calibration and measurement capability listing in 
Appendix C to the Mutual Recognition Arrangement — it is useful to explore 
these questions as carefully as possible. 

Any algorithmic mechanism for outlier identification may be incorporated 
into the Monte Carlo simulation, and each laboratory may be assigned a 
probability of being so designated. Indeed, it is possible to track any comparable 
information in this fashion. For the example data set discussed in the previous 
Section, the probability of each participant being the median laboratory was 
found to be as follows: PTB (39%), NIST (17%), NPL (44%), CSIRO (0.5%), 
NIM (0.5%). It is interesting to note in particular that all participants, including 
the laboratory that failed the consistency check (NIM), have a non-zero chance 
of being the median laboratory. 
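Tracking such information amounts to counting, over many simulated events, which participant occupies the middle rank. The sketch below does this for the Table 1 data under the simplifying assumption of normal distributions (the degrees of freedom are ignored), so its output is not expected to reproduce the percentages quoted above; the function name is hypothetical and an odd number of participants is assumed.

```python
import random

# Lab, value (mW), standard uncertainty (mW), from Table 1.
labs = [("PTB", 97.4, 0.84), ("NIST", 99.0, 0.64), ("NPL", 97.6, 1.01),
        ("CSIRO", 114.5, 6.75), ("NIM", 94.0, 1.16)]

def median_lab_probability(labs, events, seed=2024):
    """Fraction of simulated events in which each lab holds the median value."""
    rng = random.Random(seed)
    counts = {name: 0 for name, _, _ in labs}
    middle = len(labs) // 2                  # index of the median for an odd count
    for _ in range(events):
        draw = sorted((rng.gauss(x, u), name) for name, x, u in labs)
        counts[draw[middle][1]] += 1
    return {name: c / events for name, c in counts.items()}

probabilities = median_lab_probability(labs, events=20000)
for name, p in probabilities.items():
    print(f"{name:6s} {100 * p:5.1f}%")
```

An outlier-designation rule can be tallied in exactly the same way, by replacing the rank test with the rule under study.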


5.3. Other Candidate Reference Values 


The mean, weighted mean and median are three principal statistics used as
estimators of central location, and considered as Key Comparison Reference
Values. At its last meeting, the CCT agreed to the use of a so-called “average
reference value” (ARV) when determining which participants’ calibration and
measurement capability claims related to CCT-K3 should receive further
scrutiny prior to inclusion in Appendix C. In this particular context, the ARV is
defined as the simple average of the usual three candidate reference values.
Figure 4 shows these distributions determined from the CCT-K3 data at the
triple point of argon [7].^d The three usual estimators do not agree very well with
each other, and do not provide a single reference value. The shape of the median
distribution is interesting: it is very wide and flat, since many participants have
approximately equal chances of being the median laboratory. The ARV distribution is
slightly skewed, reflecting a combination of the three distributions and their correlations.


[Figure 4: probability density (PDF) versus ΔT (mK), horizontal axis from -0.50 to 0.50.]

Figure 4. Plot of CCT-K3 distributions at the triple point of argon. The ARV (dark line)
is the simple average of the mean, weighted mean, and median (light lines).


For CCT-K3, the technical experts agreed that no statistically meaningful
KCRV could be assigned, since the pooled distribution revealed a complex
underlying population with no simple estimator of central tendency. It is 
difficult to nominate a KCRV when different estimators each tend to favor or 
disfavor different participants. The simple linear combination to form the ARV 
is intended solely to serve as a single point of reference when identifying 
Appendix C claims that require further scrutiny. Such linear combinations of 
estimators of central tendency have been proposed and discussed elsewhere in 
the context of robust statistics [8]. The ease with which the Monte Carlo 
calculation yields the distribution for the ARV is illustrative of the power of this 
method, and may serve as encouragement to explore other, perhaps more 
complicated, estimators. 
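A minimal sketch of how the ARV distribution might be generated is given below. It assumes independent normal laboratory distributions, whereas the paper works with Student distributions and correlations; the function name and arguments are hypothetical.

```python
import random
import statistics

def arv_samples(values, uncertainties, trials=5000, seed=1):
    """Monte Carlo samples of the ARV: the simple average of the mean,
    the weighted mean and the median of each simulated comparison."""
    rng = random.Random(seed)
    weights = [1.0 / u ** 2 for u in uncertainties]
    weight_sum = sum(weights)
    samples = []
    for _ in range(trials):
        draws = [rng.gauss(x, u) for x, u in zip(values, uncertainties)]
        mean = statistics.fmean(draws)
        weighted_mean = sum(w * d for w, d in zip(weights, draws)) / weight_sum
        median = statistics.median(draws)
        samples.append((mean + weighted_mean + median) / 3.0)
    return samples
```

A histogram of the returned samples approximates the ARV distribution. Because the three estimators are evaluated on the same simulated draws, their mutual correlations are carried into the result automatically.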


d Available at http://www.cstl.nist.gov/div836/836.05/papers/TN1450.pdf


6. All-pairs Difference Distribution


Plotting the distributions for pooled data and estimators of central location 
enables visual insight into the comparison data set. To go beyond this qualitative 
level and quantitatively address the issue of when it is appropriate to nominate a 
KCRV calculated from the measurement results requires that the pair-difference 
distributions be calculated. 

Figure 5 presents such graphs for the data summarized in Table 1. Each of
the light curves represents the comparison from the perspective of an individual
participant. The sum of these per-laboratory perspectives is shown as the dark
curve: the all-pairs-difference pooled distribution, which is symmetric.


[Figure 5: legend entry “All-Pairs-Difference Pooled Data”; horizontal axis: Measurement Difference (mW), from -40 to 40.]

Figure 5. Pair difference distributions for the data in Table 1.


Similar to the exclusive statistics methods of [9], this pair-difference
approach can be used to build an “exclusive chi-square” statistic for each
participant by considering the appropriately scaled differences with respect to
every other participant:

$$\chi_i^2 = (N-1)^{-1} \sum_{j \neq i} \frac{(x_i - x_j)^2}{u_i^2 + u_j^2 - 2 r_{ij} u_i u_j},$$

where $x_i$ and $u_i$ denote the value and standard uncertainty of laboratory $i$, and
$r_{ij}$ the correlation coefficient between laboratories $i$ and $j$.

By averaging over all participants, we obtain a completely objective
criterion that gives no special status to any laboratory’s results: the all-pairs
variance expressed as a reduced chi-square:

$$\chi_{APD}^2 = N^{-1} \sum_{i=1}^{N} \chi_i^2.$$

This is the most general form of a chi-squared statistic for multi-participant
comparisons. It is more powerful than the specific chi-squared test suggested in
[10], which is tested against the weighted mean, since it is independent of any
choice of KCRV: any constant shift applied to all laboratory values exactly
cancels out in this treatment. These quantities are shown in Table 3 for the data
of Table 1. In each case, the probability of exceeding the computed value by
chance is very small, indicating that the data set fails the chi-squared test, and
that no single reference value can represent the underlying population.
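The two statistics can be computed directly from the laboratory values, their standard uncertainties and, where available, the correlation coefficients. The sketch below is one possible transcription; the function names and the list-based data layout are assumptions made for illustration.

```python
def exclusive_chi2(x, u, r=None):
    """Exclusive reduced chi-square of each laboratory against all others.

    x: laboratory values; u: standard uncertainties;
    r: optional matrix of correlation coefficients r[i][j] (zero if omitted).
    """
    n = len(x)
    if r is None:
        r = [[0.0] * n for _ in range(n)]
    result = []
    for i in range(n):
        total = 0.0
        for j in range(n):
            if j == i:
                continue
            variance = u[i] ** 2 + u[j] ** 2 - 2.0 * r[i][j] * u[i] * u[j]
            total += (x[i] - x[j]) ** 2 / variance
        result.append(total / (n - 1))
    return result

def all_pairs_chi2(x, u, r=None):
    """All-pairs-difference reduced chi-square: the average of the exclusive values."""
    chi2 = exclusive_chi2(x, u, r)
    return sum(chi2) / len(chi2)
```

Since only differences of laboratory values enter, adding the same constant to every element of `x` leaves both results unchanged, which is the shift invariance noted in the text.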


Table 3. Reduced chi-squares from each Lab, and the all-
pair-difference χ², each with 4 degrees of freedom.

                 PTB        NIST       NPL        CSIRO      NIM        APD
χ²               3.57       5.78       3.25       6.65       8.57       5.57
Pr(χ² > value)   6.5x10^-3  1.2x10^-4  1.1x10^-2  2.4x10^-5  6.5x10^-7  1.8x10^-4
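The tail probabilities in Table 3 can be checked in closed form. For 4 degrees of freedom the chi-square survival function is exp(-x/2)(1 + x/2), where x is four times the reduced value. The sketch below applies this to the tabulated reduced chi-squares; the dictionary layout and the CSIRO entry read as 6.65 are assumptions.

```python
import math

def tail_probability_4dof(reduced_chi2):
    """Pr(chi-square > observed) for a reduced chi-square with 4 degrees of freedom."""
    x = 4.0 * reduced_chi2  # un-reduced chi-square value
    return math.exp(-x / 2.0) * (1.0 + x / 2.0)

# Reduced chi-squares as tabulated (APD: all-pairs-difference value)
table3 = {"PTB": 3.57, "NIST": 5.78, "NPL": 3.25, "CSIRO": 6.65, "NIM": 8.57, "APD": 5.57}

for lab, value in table3.items():
    print(f"{lab:6s} {value:5.2f} {tail_probability_4dof(value):.1e}")
```

The printed probabilities agree with the second row of the table to the two significant figures shown there.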


7. Conclusions 


Monte Carlo methods have been exploited as a fast and convenient tool to create 
the distributions for any candidate reference value being considered as the 
representative statistic for summarizing a measurement comparison without the 
a priori assumption of data pooling in a single normal population.


The forthcoming Supplement 1 to the GUM: Numerical methods for the
propagation of distributions proposes the use of Monte Carlo simulation for 
uncertainty evaluation. Here we apply a modified implementation of the 
proposed Monte Carlo Simulation to construct intervals of confidence for a 
complex-valued measurand. In particular, we analyze the so-called three- 
voltage method for impedance calibration, which relates complex-valued 
impedances to voltage moduli measurements. We compare and discuss the 
results obtained with those given by application of Bootstrap Resampling on 
the same model. 


1. Introduction


Expressing physical quantities in the frequency domain requires dealing with 
variable values belonging to the field of complex numbers. A typical example is 
the phasor representation of voltages, currents and impedances in ac electrical 
measurements. The evaluation of measurement uncertainty with analytical 
methods is well established for real-valued quantities [1]. However, the 
application of these methods may be difficult when complex-valued quantities 
are involved in the measurement model. In this case, numerical techniques, such 
as Monte Carlo simulation (MCS) and bootstrap resampling (BR), may be 
advantageous since they can be implemented in a comparatively simple way, 
thus avoiding analytical difficulties.
As a case example, in this paper we study the impedance uncertainty evaluation
when the three-voltage method of measurement [2] is used.


* Work partially funded under EU SofTools_MetroNet Contract N° G6RT-CT-2001-05061.


2. The three-voltage method 


The three-voltage method is employed to measure an unknown impedance Z_x
by comparison with a reference impedance Z_s. The principle is shown in Fig. 1:
Z_x and Z_s are put in series and energized with a current I from a generator G.
Voltages U_x and U_s develop on Z_x and Z_s; vector linear combinations U_L of
U_x and U_s can be generated with the aid of an inductive voltage divider IVD;
in Fig. 1, the combinations U = U_x + U_s and U_M = (U_x − U_s)/2 are shown.
The rationale of the method lies in the much higher relative accuracy of rms (10^(…))
versus vector (10^(…)) voltmeters: by measuring three rms values (vector
moduli) |U_x|, |U_s|, |U_L|, an equation Z_x = f(|U_x|, |U_s|, |U_L|, Z_s) can be written to
recover Z_x as a vector quantity.








Figure 1. (left) Principle schematics of the three-voltage method. (right) Vector 
diagram. 


When U_M is measured (a good choice for practical measurements), the model
equation can be written as:

$$\alpha = \frac{|U_x|}{|U_s|}, \qquad \beta = \frac{|U_M|}{|U_s|},$$

$$a_r = \tfrac{1}{2}\left(1 + \alpha^2 - 4\beta^2\right), \qquad a_i = \pm\left\{\alpha^2 - a_r^2\right\}^{1/2}, \qquad (1)$$

$$\begin{bmatrix} \mathrm{Re}(Z_x) \\ \mathrm{Im}(Z_x) \end{bmatrix} = \begin{bmatrix} a_r & -a_i \\ a_i & a_r \end{bmatrix} \begin{bmatrix} \mathrm{Re}(Z_s) \\ \mathrm{Im}(Z_s) \end{bmatrix}$$

The model is strongly nonlinear and difficult to treat analytically even in its
basic form. In practice, a more refined model taking parasitic effects into
account is normally used. An implementation of the method, which defines both
impedances as four terminal-pair coaxial standards, has been developed at IEN [3].
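The basic model can be exercised numerically with a round trip: starting from assumed impedances, the three voltage moduli are computed and the unknown impedance is then recovered from them. The sketch below relies on the relation |U_x − U_s| = 2|U_M| for the midpoint voltage and on a user-supplied sign for the imaginary part of U_x/U_s, which the moduli alone cannot fix. The function name, the `sign` argument and the numerical values (in particular the inductor reactance read as 200π Ω) are assumptions.

```python
import math

def three_voltage_model(ux, us, um, zs, sign=+1):
    """Recover Z_x from the moduli |U_x|, |U_s|, |U_M| and the reference Z_s.

    U_M = (U_x - U_s)/2 is assumed; `sign` is the assumed sign of Im(U_x/U_s).
    """
    alpha = ux / us
    beta = um / us
    a_r = 0.5 * (1.0 + alpha ** 2 - 4.0 * beta ** 2)
    a_i = sign * math.sqrt(max(alpha ** 2 - a_r ** 2, 0.0))
    return complex(a_r, a_i) * zs

# Round trip with illustrative values
zs = complex(1000.0, 10.0)            # reference resistor, ohm
zx = complex(80.0, 200.0 * math.pi)   # inductor, ohm
current = 2e-3                        # ampere
u_x, u_s = zx * current, zs * current
recovered = three_voltage_model(abs(u_x), abs(u_s), abs((u_x - u_s) / 2), zs)
```

Here `recovered` reproduces `zx` to within rounding error. With noisy moduli the square root makes the result strongly nonlinear in the inputs.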


3. Numerical analysis 


For the present analysis, we considered the simplified model (1) in which the
uncertainty contributions are:

- noise in |U_x|, |U_s| and |U_M|, assumed Gaussian and uncorrelated;

- uncertainty of Z_s, assumed bivariate uniform.

We assigned a priori values to Z_s, Z_x and I and u(|U|) to obtain a sample of
3×n (n = 40) simulated readings {|U_x|_k, |U_s|_k, |U_M|_k} (k = 1...n). On the same
dataset we performed two different numerical uncertainty analyses.

3.1. Monte Carlo simulation 


“Experimental” means and standard deviations from the sample were used as parameters of
three Gaussian distributions, from which m Monte Carlo samples were
generated (each made of 3×n voltage values).

To each 3×n sample, one systematic contribution taken at random from the
bivariate uniform pdf of Z_s is associated. Each individual set of three voltages,
together with the extracted Z_s value (the same for each set), is then processed in the
model. This technique is preferred to that suggested in the forthcoming GUM
Supplement 1 [4], due to the high nonlinearity of the model. On the n resulting
Z_x values the chosen estimator (mean) is computed.

The result is a set of m estimates of Z_x, from which the bivariate pdf is
constructed and confidence domains can be computed. In practice, from the
Z_x set, we obtained the 95% shortest intervals of confidence for Re[Z_x] and
Im[Z_x].
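The procedure of this subsection can be sketched as follows, with the measurement model and the sampler for the reference impedance passed in as functions. This is a schematic outline under stated assumptions, not the authors' code: `mcs_estimates`, `model` and `zs_sampler` are hypothetical names, and the readings are assumed to be stored as a list of n triplets.

```python
import random
import statistics

def mcs_estimates(readings, model, zs_sampler, m=1000, seed=1):
    """Return m Monte Carlo estimates of Z_x.

    readings: list of n triplets (|U_x|, |U_s|, |U_M|);
    model(ux, us, um, zs) -> complex; zs_sampler(rng) -> complex.
    """
    rng = random.Random(seed)
    n = len(readings)
    columns = list(zip(*readings))
    means = [statistics.fmean(c) for c in columns]
    stds = [statistics.stdev(c) for c in columns]
    estimates = []
    for _ in range(m):
        zs = zs_sampler(rng)  # one systematic draw, shared by the whole 3 x n sample
        total = 0j
        for _ in range(n):
            ux, us, um = (rng.gauss(mu, sd) for mu, sd in zip(means, stds))
            total += model(ux, us, um, zs)
        estimates.append(total / n)  # chosen estimator: mean of the n model outputs
    return estimates
```

The real and imaginary parts of the returned estimates form the sample from which the bivariate pdf and the intervals of confidence are built.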


3.2. Bootstrap resampling 


In this case, the analysis is the same as in Sec. 3.1, except that, following a prior
investigation [5], from the original simulated sample we obtained m bootstrap
resamplings (each of 3×n voltage values).
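A corresponding sketch of the bootstrap variant, together with a helper for the shortest interval of confidence, is given below. It assumes that the triplets of readings are resampled jointly with replacement, so that any correlation between the three voltages is preserved; this detail and all names are assumptions.

```python
import math
import random

def bootstrap_estimates(readings, model, zs_sampler, m=1000, seed=1):
    """Return m bootstrap estimates of Z_x from n triplets of voltage readings."""
    rng = random.Random(seed)
    n = len(readings)
    estimates = []
    for _ in range(m):
        zs = zs_sampler(rng)
        # n triplets drawn with replacement from the original sample
        resample = [readings[rng.randrange(n)] for _ in range(n)]
        estimates.append(sum(model(ux, us, um, zs) for ux, us, um in resample) / n)
    return estimates

def shortest_interval(values, coverage=0.95):
    """Shortest interval containing at least `coverage` of the sorted values."""
    data = sorted(values)
    k = math.ceil(coverage * len(data))  # number of points the interval must contain
    start = min(range(len(data) - k + 1), key=lambda i: data[i + k - 1] - data[i])
    return data[start], data[start + k - 1]
```

Applying `shortest_interval` separately to the real and imaginary parts of the estimates gives interval widths of the kind denoted W in the Results section.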
4. Results


Table 1 shows the results of both methods, for a comparison between a resistor
Z_s = (1000 + j10) Ω with u(Re[Z_s]) = u(Im[Z_s]) = 5 mΩ, and an inductor
Z_x = (80 + j200π) Ω, using I = 2 mA, u(|U|) = 10 µV, m = 40000. For brevity
only full interval widths W(•) are given (corresponding coverage factors can be
computed as k = W/(2s)). Fig. 2 shows an example of output bivariate
distribution for Z_x.


Table 1. Comparison between MCS and BR, for various
combinations of input uncertainties (values given in the
text). s(•) is the standard deviation, W(•) is the width of the
95% shortest interval of confidence.

Case  u(Zs)  u(U)  s(Re[Zx])  W(Re[Zx])  s(Im[Zx])  W(Im[Zx])
                   mΩ         mΩ         mΩ         mΩ
MCS   Yes    Yes   3.7        13.8       3.3        11.6
BR    Yes    Yes   3.6        13.4       3.3        11.6
MCS   No     Yes   2.0        7.8        0.9        3.6
BR    No     Yes   1.9        7.4        1.0        3.9
MCS   Yes    No    3.2        10.5       3.2        10.5
BR    Yes    No    3.2        10.5       3.2        10.5







[Figure 2: surface plot over Re[Zx] (Ω) and Im[Zx] (Ω).]

Figure 2. Output bivariate distribution for Z_x (origin in the estimate).


5. Comments


Both methods are able to construct multivariate distributions for vector-valued 
measurands, with arbitrary models. 

The MCS implementation proposed here differs slightly from that suggested in
[4]. Although it is computationally heavier, we believe that it is more
suitable for highly nonlinear models.

As concerns the differences between MCS and BR, the latter is more respectful 
of the experimental information, whereas the former uses statistics depending to 
some extent on the experimenters’ judgement (type of input pdfs, evaluation of 
input correlations). For example, BR automatically takes into account possible 
correlations between input data. 

BR tends to slightly underestimate intervals of confidence even with sizeable
samples (n = 40 in our case), except in the obvious case in which there is no
experimental noise and the two methods degenerate into one (last two rows of
Tab. 1).

In conclusion, we think that BR deserves consideration for those experimental
situations, frequent in automated measurements, in which large samples can be
obtained. For small sample sizes, MCS should be the preferred choice.


In this paper we focus on the main problem of quantum state tomography: the
reconstructed density matrices are often not physical because of experimental noise.
We propose a method to avoid this problem using Bayesian statistical theory.


The emerging field of quantum information science exploits quantum
mechanics to achieve information processing tasks that are impossible in
the classical world. In order to complement and strengthen these technologies,
standards and measurement methods must be developed in the realm
of quantum metrology. In particular, a central role for benchmarking these
quantum information technologies will be played by quantum state tomography
(QST) and quantum process tomography. In this paper we
focus on QST, and in particular we propose an estimation method based
on Bayesian statistical theory to obtain a properly defined reconstructed
density matrix (DM).

The most general DM of a quantum system in an N-dimensional Hilbert
space can be expressed as $\rho = (1 + \sum_{i=1}^{N^2-1} s_i \Omega_i)/N$, where the $N^2 - 1$
Stokes parameters $s_i$ are real. $\{\Omega_i\}$ is a set of $N^2 - 1$ hermitian matrices
with null trace ($\mathrm{Tr}\{\Omega_i\} = 0$) satisfying $\mathrm{Tr}\{\Omega_i \Omega_j\} = N \delta_{ij}$. In order to have
a properly defined DM the coefficients $s_i$ satisfy the inequality $\sum_i s_i^2 \le N - 1$;
thus the coefficients $s_i$ can be considered as the coordinates of points
in a hyper-sphere (H-S) in $N^2 - 1$ dimensions of radius $\sqrt{N - 1}$. Each point
of the H-S corresponds to a possible DM: the points on the surface are pure
states, and the center of the H-S is the completely mixed state (this is the
generalization of the Bloch sphere for a two-dimensional Hilbert space).


*Work partially funded under EU SofTools-MetroNet Contract N. G6RT-CT-2001-05061

The QST process is based on the ability to reproduce a large number
of identical states and perform a series of measurements on complementary
aspects of the state within an ensemble in order to obtain the
DM of the quantum state from a linear transformation of experimental
data. In this case we consider a set of $N^2 - 1$ projective measurement
operators $\{P_m\}$; thus the probability of observing the measurement outcome
m is $p_m = \mathrm{Tr}\{P_m \rho\}$. As the $P_m$ are hermitian with trace one,
we can write $P_m = 1/N + \sum_{n=1}^{N^2-1} \Gamma_{mn} \Omega_n$, leading to the linear relations
between the probabilities of measurement outcome and the Stokes
coefficients $p_m = 1/N + \sum_{n=1}^{N^2-1} \Gamma_{mn} s_n$. By a proper choice of the measurement
operators $P_m$, the matrix $\Gamma_{mn}$ can be inverted and the Stokes
coefficients can be obtained directly from the probabilities of outcomes
as $s_n = \sum_{m=1}^{N^2-1} \Gamma^{-1}_{nm} (p_m - 1/N)$. However, a significant drawback of this
method is that the recovered DM might not correspond to a physical DM
because of experimental noise. For example, the DM of any quantum system
must be hermitian, positive semi-definite with unit trace; nevertheless
the DM reconstructed by QST often fails to be positive semi-definite.

In this paper we study the manipulation of experimental data from a
quantum experiment. As quantum theory is a set of rules allowing the computation
of probabilities for the outcomes of tests which follow specified
preparations, Bayesian statistical theory appears to be particularly suitable
for this data manipulation. We link the Bayesian notion of prior probability
to the information about the quantum state preparation. This information
attaches some a priori physical constraints allowing a physically
valid DM reconstruction by QST. In particular, if we do not have any prior
belief, the prior probability distribution (Pri-PD) of our system is a uniform
distribution in the Bloch H-S, $P(\{s_i\}) = V^{-1} \Theta(N - 1 - \sum_{i=1}^{N^2-1} s_i^2)$, where $\Theta$
is the step-function and V is the volume of the H-S of radius $\sqrt{N - 1}$. The
posterior probability distribution (Pos-PD) is determined by means of the
experimental results according to

$$P(\{s_i\}|\{E\}) = \frac{L(\{E\}|\{s_i\})\, P(\{s_i\})}{\int L(\{E\}|\{s_i\})\, P(\{s_i\})\, \mathrm{d}\{s_i\}},$$

where $\{E\}$ is the set of experimental results and $L(\{E\}|\{s_i\})$ is the likelihood
function of the experimental results given the Stokes parameters
$s_i$. The point estimator of $s_i$ is often taken to be the mean of the Pos-PD
of $s_i$; in this way we always obtain a properly defined DM. This estimator
of $s_i$ and the maximum likelihood estimator converge asymptotically
to the same value and they are equally efficient. We have to consider that
our estimation comes from experimental data as well as from our prior belief.
A prior belief not corresponding to the physical situation may bias the
estimation. For this reason it is suggested to perform a sequence of measurements
where the Pos-PD of a measurement is used as the Pri-PD of
the subsequent measurement. We performed some numerical simulations in
the case of pure and mixed states of a Hilbert space of dimension 2, using
the Adaptive-Genz-Malik algorithm, obtaining an appreciable convergence
after only three iterations of the procedure above, even in the presence of
a biased Pri-PD.
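For a two-dimensional Hilbert space the posterior mean can be approximated with a few lines of code. The sketch below replaces the adaptive Genz-Malik integration mentioned above with plain Monte Carlo integration over the uniform prior, and assumes projective measurements along the three Bloch axes with outcome probabilities (1 ± s_k)/2 and binomial counts; these choices and all names are assumptions made for illustration.

```python
import math
import random

def sample_bloch_ball(rng):
    """Uniform point in the unit ball (N = 2, radius sqrt(N - 1) = 1), by rejection."""
    while True:
        s = [rng.uniform(-1.0, 1.0) for _ in range(3)]
        if sum(c * c for c in s) <= 1.0:
            return s

def log_likelihood(s, counts):
    """counts[k] = (n_plus, n_minus) observed for the measurement along axis k."""
    total = 0.0
    for component, (n_plus, n_minus) in zip(s, counts):
        p_plus = max((1.0 + component) / 2.0, 1e-300)   # guard against log(0)
        p_minus = max((1.0 - component) / 2.0, 1e-300)
        total += n_plus * math.log(p_plus) + n_minus * math.log(p_minus)
    return total

def posterior_mean(counts, samples=20000, seed=1):
    """Mean of the posterior over the Bloch ball with a uniform prior."""
    rng = random.Random(seed)
    points = [sample_bloch_ball(rng) for _ in range(samples)]
    log_weights = [log_likelihood(s, counts) for s in points]
    top = max(log_weights)
    weights = [math.exp(lw - top) for lw in log_weights]  # rescaled to avoid underflow
    norm = sum(weights)
    return [sum(w * s[k] for w, s in zip(weights, points)) / norm for k in range(3)]
```

With counts such as `[(20, 0), (20, 0), (20, 0)]`, linear inversion would give s = (1, 1, 1), which lies outside the Bloch sphere. The posterior mean is a weighted average of points inside the sphere and therefore always corresponds to a physical DM.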

We also discuss the case when the experimentalist needs several copies
of a quantum system in a specified state (point Q in the Bloch H-S) to
perform a certain experimental task. In practice he is able to prepare the
quantum system only approximately in that specified quantum state, but
every experimental task may accept a certain tolerance in the quantum
state preparation (described by a small H-S centered in Q). We exploit
the Bayesian approach to test whether the prepared quantum state is in
the intersection between the Bloch H-S and the small H-S Q. We refer to
hypothesis Q when we consider the Pri-PD $\pi(\{s_i\})$ (a step-function that is
nonzero in the intersection between the two H-Ss), and as the alternative
hypothesis A we choose the Pri-PD $P(\{s_i\})$. We define the Bayes factor
B as

$$B = \frac{\int L(\{E\}|\{s_i\})\, P(\{s_i\})\, \mathrm{d}\{s_i\}}{\int L(\{E\}|\{s_i\})\, \pi(\{s_i\})\, \mathrm{d}\{s_i\}}:$$

B < 1 is in favour of Q against A, while
B > 1 is in favour of A against Q. The experimentalist performs the test
according to the following procedure: he performs the first measurement
and he calculates the parameter B and the Pos-PD using as $P(\{s_i\})$ the
uniform probability distribution in the Bloch H-S. Assuming that the result
is B < 1, he performs a second measurement and he calculates B
and the new Pri-PD using as $P(\{s_i\})$ the Pos-PD obtained from the first
measurement. This procedure is applied iteratively until he obtains B > 1.
The Pos-PD of this last measurement allows the experimentalist to decide
if the prepared states are in the region of acceptable tolerance or not, for
example by estimating the Stokes coefficients $s_i$ and checking if they are in
the intersection between the two H-Ss.

We are deeply indebted to W. Bich for helpful suggestions.


Based on the orthodox theory of single electronics, a simulation of a tunnel junction is
performed, aiming to investigate whether quasiparticle events are predicted to transfer
fractional charge. The related outcome from the software package MOSES (Monte-Carlo
Single-Electronics Simulator) is discussed.


1. Introduction 


The innovative impact of Single Electron Tunneling (SET) theory and practice
is well known in many scientific and technological areas [1]; one field of
application is metrology, where much interesting research is being
developed worldwide [2]. The flowering of SET applications involves both the
nano- and meso-scales, eventually contributing to a redesign of the perspectives of
investigations into the constituents of matter too. Findings at the
intra-atomic level are calling into question the status of electrons as elementary
particles: two decades ago, quasiparticles were proposed to carry fractional
charge and discussed by Laughlin [3]; such quasiparticles have now been
successfully investigated in a very recent work, with application to quantum
Hall systems.

This paper is concerned with the so-called orthodox theory [5] of single
electronics, based on a model where the electric charge is a continuous
quantity, constraints on granularity being reduced at a sub-e scale, where
electrical shielding phenomena arise at the Debye length. The intended goal is to
investigate by simulation whether quasiparticles too (instead of electrons
only) are predicted, in the framework of the theory, to transfer fractional
charge in a SET device. Emphasis is placed on the behaviour of a current-biased,
resistively-loaded tunnel junction described in Fig. 1 [5]. The research work is
developed by use of the software package MOSES (Monte-Carlo Single-Electronics
Simulator) [6]. The outcome is discussed, taking into account some
side-issues already addressed in a related work on computational aspects of
counting electrons one by one.


* Work partially funded under EU SofTools_MetroNet Contract N° G6RT-CT-2001-05061.
MOSES: © 1995 by R.H. Chen; © 2001 by D.M.R. Kaplan and by Ö. Türel.


2. Simulation: orthodox theory and its application


The orthodox theory is exhaustively expounded by Averin and Likharev [5]: its
main result is that single-electron tunneling across a SET device is always a
random event, occurring at a rate (its probability per unit time) that depends
solely on the reduction of the free electrostatic energy of the device, as required
for the event to occur. The process that makes possible the transfer of a single
electron despite the crowded junction electrodes (with typically 10^(…) free
electrons) is a consequence of the Coulomb repulsion of electrons. In order to
observe SET effects, temperature constraints are related to the tunnel dimension,
approximately: 4.2 K (liquid helium) at 6.0 nm; 77 K (liquid nitrogen) at 1.6 nm;
300 K at 0.8 nm. The theory makes the following major assumptions: the
electron energy quantization inside the (metal) conductors is ignored; the time
for electron tunneling through the (insulator) barriers is negligible when
compared with other time scales; coherent quantum processes responsible for
possible simultaneous (co-tunneling) events are ignored. Focusing on the subject
of the present research, a dummy assumption that can be made explicit regards
the charge flow; if Q_t is the charge transferred across the junction during a given
time interval Δt, then (Fig. 1) Q_t = q − Q, where q is the time integral of I = I_0 − I_s:

$$q = \int_{\Delta t} I \, \mathrm{d}t. \qquad (1)$$

According to Eq. (1), the charge is theoretically allowed to take also any
fractional value of e, making events of quasi-particle transfer likely to occur,
being in principle constrained by the time interval Δt.


[Figure 1: circuit schematic with the tunnel junction labelled.]

Figure 1. Current-biased, resistively (R_L) loaded tunnel junction: I_t, Q, C_t and G_t,
tunnel current, charge, capacitance and conductance; U, voltage input between nodes 1
and 2 (reference); φ, voltage between island 3 and node 2; I_s and G_s, shunt current and
conductance (ignored in simulation).


Table 1. Simulation response: time-average jumps, island charge Q and voltage φ. Three
runs data.

Jumps                   Island charge Q, voltage φ
1.03   0.996  1.04      0.431  0.471  0.455  0.436
0.732  0.679  0.707     0.393  0.415  0.444  0.428
0.503  0.516  0.521     0.379  0.436  0.384  0.420





Fractional charge tunneling in a SET device (Fig. 1) has been simulated, using
MOSES [7]. The simulation parameters are normalized in terms of capacitance
C_t and conductance G_t (once the simulation is over, the results can be converted
to real units by introducing the actual values for C_t and G_t). The units used are:
charge: e; energy: e²/C_t; voltage: e/C_t; time: C_t/G_t; temperature: e²/(kC_t) (k,
Boltzmann constant); e.g., if C_t = 10^-16 farad, the unit temperature is 18 K. In our
full Monte-Carlo simulation all input data are set to 1, except temperature (0.001),
and single run iterations are performed with: time step: 10^(…), number of periods
per run: 10^(…), max total time per run: 10^(…), time period: 10^(…). After exploratory
trials, the dc voltage is tuned below U = 0.4525, corresponding to single whole-e
jumps across the tunnel. Tab. 1 shows some results, including time-averaged
sub-e jumps.
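The conversion from the normalized temperature unit to kelvin can be checked with a short calculation. The sketch below uses the exact SI values of the elementary charge and the Boltzmann constant; the function name and the capacitance of 10^-16 F are assumptions taken for illustration.

```python
E_CHARGE = 1.602176634e-19   # elementary charge, coulomb
K_BOLTZMANN = 1.380649e-23   # Boltzmann constant, joule per kelvin

def unit_temperature(capacitance):
    """Temperature unit e^2 / (k C) in kelvin for a capacitance in farad."""
    return E_CHARGE ** 2 / (K_BOLTZMANN * capacitance)

def to_kelvin(normalized_temperature, capacitance):
    """Convert a normalized simulation temperature to kelvin."""
    return normalized_temperature * unit_temperature(capacitance)
```

For a capacitance of 10^-16 F the unit comes out at about 18.6 K, consistent with the 18 K quoted above, so a normalized temperature of 0.001 would correspond to roughly 19 mK.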


3. Final remarks 


Fractional jump values, relative to the net jump of one particle e, can be
interpreted as the tunneling of corresponding sub-e particles. The simulation
result is a criticism of electron individuality, opening sub-e perspectives too.


Abstract

To comply with point 5.4.5 of the ISO/IEC 17025 standard, laboratories
“shall validate non-standard methods, laboratory designed/developed methods, standard
methods used outside their intended scope, and amplifications and modifications of
standard methods to confirm that the methods are fit for the intended use” [1]. This
requirement of the standard is new, and laboratories are evaluating approaches to
the validation process in terms of the quality system. A procedure for validation of
calibration methods is proposed and an example of validation of results of
measurements is described.


1. Introduction 


To fulfil the requirements of the ISO/IEC 17025 standard [1] (General
Requirements for the Competence of Testing and Calibration Laboratories),
laboratories must: define their quality objectives; ensure that the performance of
tests, reference material certifications and calibrations is in agreement
with the methods described in technical procedures, with the clients’
requirements and with good professional practice; validate
methods and results; have the management review the fulfilment of the objectives of the
activity plans; evaluate the performance of the processes, through
audit reports, satisfaction inquiries and information
from complaints; carry out preventive and corrective actions; and consider the
outputs from the working groups for improvement.

A brief overview of the validation concept and of the fulfilment of point 5.4.5
of the ISO/IEC 17025 standard is given in this text. A procedure
for validation of calibration methods is proposed and an example of validation
of results of measurements is described using the statistical technique
‘Experimental Design’, an active tool that enables inferences about the results,
namely from reproducibility measurements and methods comparisons.


* Work partially funded under EU SofTools_MetroNet Contract N° G6RT-CT-2001-05061.


1.1 General principles and concepts 


The ISO standard 9000:2000 [2] defines “validation” as “confirmation
through the provision of objective evidence that the requirements for a specific
intended use or application have been fulfilled...”^c.

The ISO/IEC 17025 states at point 5.4.5 that ‘the laboratory shall validate
non-standard methods, laboratory designed/developed methods, standard
methods used outside their intended scope, and amplifications and modifications
of standard methods to confirm that the methods are fit for the intended use’.

As a consequence, laboratories should describe the way the validation of
calibration methods is done, and this description should be part of their quality
documents. In the validation process, the method is evaluated in terms of its
representativeness, repeatability and reproducibility^d, and its ability
to satisfy the defined requirements, its technical capability and the expected or
required uncertainty of the results are considered.


2. Procedures for method validation 


The Laboratory Quality Manual will state the general principles to observe
in the validation of the calibration methods. Each calibration procedure will
also include a validation chapter that describes the way the validation is
performed, the calibration requirements and how to fulfil them,
and the statistical tools to be used to validate the results. To accomplish these
goals, studies or developments will be performed, taking into consideration whether the
methods are internal, new or used by other laboratories, or reference methods.

The EUROLAB Report [3] considers two major approaches to the
validation process: the scientific approach and/or the comparative one.

In the first approach, the scientific approach, the method representativeness
is evaluated by the published scientific and technical literature, by the research
and developments performed by the laboratory, by simulation and modelling, by
the method robustness and by the studies for the optimisation of the laboratory
performance. In the other approach, the comparative one, the method is assessed
by comparing its results to those obtained by another validated method, which has
been developed for the same purpose. The comparisons may be performed with
standard methods, reference methods, international standards or certified
reference materials and/or interlaboratory comparisons. In the method
comparisons the results must fall within the comparison uncertainty interval.

Finally, the validation of the method will be obtained by the combined use
of the above-described procedures.


c ISO 9000:2000 includes these two notes: Note 1: The term “validated” is used to designate the
corresponding status; Note 2: The use conditions for validation can be real or simulated.

d (VIM, 3.6 and 3.7 [4]) Repeatability (of results of measurement): closeness of the agreement
between the results of successive measurements of the same measurand carried out under the same
conditions of measurement. Reproducibility (of results of measurements): closeness of the agreement
between the results of measurements of the same measurand carried out under changed conditions of
measurement.


3. Validation of results 


3.1 General Model 


Calibration laboratories are also required to validate the results of
measurements and the associated uncertainties. The laboratory designs the
experiment and an adequate number of measurements are performed. For the
analysis of the results, the use of one of the statistical tools of
“Experimental design” is proposed.

These tools are generally used for the improvement and optimisation of a
process. Experimental design is an active method: by changing the inputs and
observing the corresponding changes in the outputs, it allows the inference that the
outputs are statistically different, by rejecting the null hypothesis (H0), for a
significance level α [5], also known as the “producer’s” risk^e.

Conversely, it can be used to test the homogeneity of a sample^f, for the same
significance level, to identify which results can be considered as outliers and the
way to deal with them in an SPC^g concept, and to determine the uncertainty associated
with the sample. The procedure to be described is the well-known analysis of
variance (ANOVA).


3.2 Example of validation of calibration results of a comparison of two
thermometric tin fixed point cells


Short description of the laboratory work

For the comparison of the two tin sealed cells, we used two furnaces
in order to realize the tin freezing point (t = 231.928 °C) simultaneously with
each cell in its own furnace. Nine different freezing point plateaus were performed
for each cell on consecutive days, using 3 standard platinum resistance
thermometers (SPRTs) A, B and C at each plateau, which allowed us to obtain, each
day, three sets of measurement differences. The “randomised block
design without interaction” was chosen to evaluate whether there were effects between the
measurements performed on each day and/or between the measurements
performed by each thermometer.

If no effects are detected, the total average of the differences of the two cells
will be a good estimator of the difference of the cells.


e The probability of making the so-called type I error: rejecting the null hypothesis when it is true.
f In this case the sample is the set of measurements.
g SPC: Statistical Process Control

Figure 1: Freezing point plateau of a tin fixed-point (ΔT versus time in hours)


The model 

If we have different effects that we wish to compare, each group of
observations for each effect is a random variable. The mn observations can be
described by the model:

$$Y = \{y_{ij}\} \qquad (i = 1, 2, \ldots, m;\; j = 1, 2, \ldots, n)$$

$$y_{ij} = \mu + \tau_i + \beta_j + \varepsilon_{ij}, \qquad \varepsilon_{ij} \sim N(0, \sigma^2)$$

where $y_{ij}$ is the (ij)th observation, $\mu$ the overall mean, $\tau_i$ the ith
factor-line effect (also known as treatment) and $\beta_j$ the jth factor-column
effect (also known as block), both considered as deviations from the overall
mean, and $\varepsilon_{ij}$ the random error component, normally and
independently distributed with mean zero and variance $\sigma^2$. The variance
is assumed to be constant.


To test the null hypotheses

$$H_{0,\mathrm{Line}}: \tau_1 = \tau_2 = \ldots = \tau_m = 0 \qquad \text{and} \qquad H_{0,\mathrm{Column}}: \beta_1 = \beta_2 = \ldots = \beta_n = 0$$

we use the statistics

$$F_{\mathrm{Line}} = \frac{MS_{\mathrm{Line}}}{MS_{\mathrm{Error}}} \sim F_{\alpha,\,m-1,\,(m-1)(n-1)}, \qquad F_{\mathrm{Column}} = \frac{MS_{\mathrm{Column}}}{MS_{\mathrm{Error}}} \sim F_{\alpha,\,n-1,\,(m-1)(n-1)}$$

where F is the sampling distribution (F distribution); $MS_{\mathrm{Line}}$ and
$MS_{\mathrm{Column}}$, the mean squares between factors (line or column), are
unbiased estimates of the variance $\sigma^2$ if $H_0$ is true and overestimates
of $\sigma^2$ if it is false; and $MS_{\mathrm{Error}}$, the mean square error,
is always an unbiased estimate of the variance $\sigma^2$.

If $F_{\mathrm{Line}} > F_{\alpha,\,m-1,\,(m-1)(n-1)}$ or
$F_{\mathrm{Column}} > F_{\alpha,\,n-1,\,(m-1)(n-1)}$, the corresponding null
hypothesis is rejected at a significance level α and the results are
statistically different. Otherwise the null hypothesis is not rejected at a
significance level α, the results are “not different” and the samples belong to
the same statistical population. Table 1 displays the 27 obtained temperature


TABLE 1 
Tin Freezing point - Comparison of two cells 


Temperature differences 


0,11 0,07 0,05 0,02 0,00 0,05 
0,09 0,09 0,01 0,00 0,05 0,07 
0,06 0,15 0,06 0,00 0,06 0,14 





differences between the two cells (see also Fig. 2). From ANOVA Table 1
we obtained two F0 values, which are compared with the F distribution
for α = 5% and 2 and 16 degrees of freedom, F(0,05; 2; 16) = 3,6337, for the
SPRT effect, and with the F distribution for α = 5% and 8 and 16 degrees of
freedom, F(0,05; 8; 16) = 2,5911, for the measurements/days effect.

The null hypothesis is rejected for the measurements / days effect but not 
rejected for the measurements performed by the three thermometers. 
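
The F statistics of the randomised block design without interaction can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `randomized_block_anova` and the small data set are hypothetical (they are not the measured temperature differences), and the critical values are the ones quoted in the text rather than computed.

```python
def randomized_block_anova(y):
    """Two-way ANOVA without interaction, one observation per cell.

    y[i][j] is the observation for line (treatment) i and column (block) j.
    Returns the two F statistics and the degrees of freedom.
    """
    m, n = len(y), len(y[0])
    grand = sum(sum(row) for row in y) / (m * n)
    line_means = [sum(row) / n for row in y]
    col_means = [sum(y[i][j] for i in range(m)) / m for j in range(n)]

    # Sums of squares for lines, columns, total and (by difference) error
    ss_line = n * sum((a - grand) ** 2 for a in line_means)
    ss_col = m * sum((b - grand) ** 2 for b in col_means)
    ss_total = sum((y[i][j] - grand) ** 2 for i in range(m) for j in range(n))
    ss_err = ss_total - ss_line - ss_col

    df_line, df_col = m - 1, n - 1
    df_err = df_line * df_col
    ms_err = ss_err / df_err
    return {
        "F_line": (ss_line / df_line) / ms_err,
        "F_column": (ss_col / df_col) / ms_err,
        "df": (df_line, df_col, df_err),
    }


# Hypothetical example: 3 thermometers (lines) x 4 days (columns), values in mK
example = [
    [0.11, 0.07, 0.05, 0.02],
    [0.09, 0.09, 0.01, 0.00],
    [0.06, 0.15, 0.06, 0.00],
]
result = randomized_block_anova(example)
print(result)
```

Each F value would then be compared with the tabulated critical value for the matching degrees of freedom, for example 3,6337 for the SPRT effect and 2,5911 for the days effect in the 3 x 9 design described above.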





Figure 2: Schematic representation of the observed temperature
differences (in mK) for the SPRTs over the nine measurement days





" F distribution — Sampling distribution. If y,? and y, are two independent chi-square random 


variables with u and v degrees of freedom, then the ratio of each variable divided by its degrees
of freedom is distributed as F with u numerator and v denominator degrees of freedom.
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3.3 Criteria for dealing with outlying measurements 


It is proposed to use a sample statistic to be compared with a critical value 
in order to determine the measurements to reject. 


ANOVA TABLE 1
Source of variation     Sum of squares   Degrees of freedom   Mean square   F0
Factor-line (SPRTs)                              2              0,0017      1,8787
Factor-column (days)                             8              0,0029      3,2539
Error                                           16              0,0009





From ANOVA calculations we have detected the existence of a day or days 
where the values may be considered as outliers. 

The method used to detect these measurements was the “Test on the mean of a
normal distribution with unknown variance” for a significance level of 5%. The
normality assumption is needed to formally develop the test but “moderate
departures for normality will not seriously affect the results” [5, p. 79].


The test statistic used is

$$t_0 = \frac{\bar{x} - \mu_0}{s/\sqrt{n}}$$

and the null hypothesis will be rejected if $|t_0| > t_{\alpha/2,\,n-1}$, where
$t_{\alpha/2,\,n-1}$ is the upper α/2 point (bilateral test) of the
t-distribution with (n - 1) degrees of freedom.

The average of the 27 measurements, equal to 0,066 mK, will be taken as the
mean $\mu_0$. The 9 daily averages will be used to calculate the $t_{0,day}$
values. Table 2 lists these values. Comparing them with the critical value
t = 2,7515, we conclude that the measurements from day 6 are to be rejected.
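
As a minimal sketch of this test, the statistic can be computed for each day from its three readings. The function names `t_statistic` and `is_outlying` and the readings in the example are hypothetical; the reference mean 0,066 mK and the critical value 2,7515 are the ones quoted in the text.

```python
import math
import statistics


def t_statistic(values, mu0):
    """t0 = (mean - mu0) / (s / sqrt(n)) for one group of readings."""
    n = len(values)
    mean = statistics.mean(values)
    s = statistics.stdev(values)  # sample standard deviation (n - 1)
    return (mean - mu0) / (s / math.sqrt(n))


def is_outlying(values, mu0, t_critical):
    """Two-sided test: flag the group when |t0| exceeds the critical value."""
    return abs(t_statistic(values, mu0)) > t_critical


# Hypothetical readings (mK) for one day, tested against the overall mean
day_readings = [0.05, 0.07, 0.14]
print(t_statistic(day_readings, mu0=0.066))
print(is_outlying(day_readings, mu0=0.066, t_critical=2.7515))
```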


TABLE 2
t0,day     3,2835 | 1,6212 | 1,1492 | 0,7798


Now we recalculate the ANOVA table omitting the values from day 6. The
F0 values obtained are again compared with the F distribution for α = 5%
and 2 and 14 degrees of freedom, F(0,05; 2; 14) = 3,7389, for the SPRT effect,
and with the F distribution for α = 5% and 7 and 14 degrees of freedom,
F(0,05; 7; 14) = 2,7642, for the measurements/days effect.

The null hypotheses are not rejected, so there is no effect between the
measurements performed with the different thermometers or between the
measurements/days, for a significance level α = 5%.


ANOVA TABLE 2
Source of variation     Sum of squares   Degrees of freedom   Mean square   F0
Factor-line (SPRTs)                              2              0,0020      2,0746
Factor-column (days)                             7              0,0017      1,7153
Error                                           14              0,0010





Alternatively, it may be stated that there is no significant difference
between the 24 sets of measurements, that they belong to the same statistical
population, and that the average (ΔT = 0,073 mK) and the variance of the mean
(σ² = 5,28 × 10^-5 mK²) with 23 degrees of freedom are good estimates,
respectively, of the mean and the variance of the differences of the two fixed
point cells.

We are now able “to provide objective evidence, that the requirements for a
specific intended use or application have been fulfilled...”, that is to say,
we have performed the validation of the comparison measurement results.

As a consequence, the standard deviation of the mean obtained
(σ = 0,0073 mK) is appropriate for the evaluation of the type A component
of the uncertainty of the comparison. The type B components, from prior
information, are then added in order to complete the uncertainty budget.


4. Concluding Remarks


An approach to answering point 5.4.5 of the new ISO/IEC 17025 standard was
outlined, together with a description of what should be included in the
Quality documents for the validation of methods and results.

An application of the randomised block design has been used to illustrate a
practical validation of the results of a comparison of two thermometric fixed
points in a repeatability/reproducibility situation; it can easily be applied
to calibration results.
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Various least squares methods have been compared in respect to the straight 
line fitting to data sets with errors on both variables, to check the benefit of
using the most appropriate method for dealing with heteroscedastic data, the
element-wise total least squares (EW TLS). It is found that the EW TLS always
gives the correct estimate; weighted least squares can sometimes also be a good
approximation, but this cannot be guessed a priori.


1. Introduction 


When the experimental data are affected by errors in both variables, the 
metrologist faces the problem that the ordinary least squares approximation 
(OLS) does not provide an optimal answer in estimating the regression 
parameters and their uncertainty. The same situation arises when the data are, 
additionally, heteroscedastic and the weighted least squares approximation
(WLS) is used, since the simple weighting operation does not adjust for errors 
in the independent variable. 

In an OLS problem the sum of squared distances along one coordinate axis is
minimized, while TLS aims to minimize the sum of squared perpendicular
distances to the best line, and the TLS correction is always smaller
than the one obtained with OLS. In the present work a method, recently 
developed by two of the authors (AP and MLR) called element-wise TLS (EW 
TLS), has been considered. It uses an iterative procedure to approximate a TLS 
solution for the heteroschedastic case [1]. Since it does not presently allow a 


" Work partially funded under EU SofTools_MetroNet Contract N. G6RT-CT-2001-05061 
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direct estimation of the model parameter uncertainty, its results have been 
compared with the OLS and the WLS ones. 

Two series of data have been considered in this testing. The first is a set of
real data under study in thermal metrology, where the range of the uncertainty 
values is quite broad for the independent variable. The second series is a 
simulated set of data constructed in order to emphasise the benefit of the use of 
the correct method, the TLS. 

The uncertainty of the model parameters has been evaluated by means of a 
bootstrap technique [2], and the overall uncertainty with a Monte Carlo 
technique [3] taking into account the uncertainty on both variables of each 
experimental point, both in the cases of OLS and WLS. 


2. Comparison of LS Evaluations with Real Thermal Data Series 


Table 1 reports the set of data. The range of uncertainties is much broader for 
one variable than for the other. In addition, the data arise from different series of 
independent measurements: the position of the zero is arbitrary and the relative
position has been set visually. Therefore, they have been considered either as an 
overall set or as subdivided in the original series, in order to additionally detect 
with the fixed-effect model technique [4] a translation effect or, equivalently, 
possible biases between series when using OLS or WLS. 

The main characteristics of the data are: 


different U(Lab) and U(x) from point to point
unknown relative position Δy between Lab measurements
one outlying point at visual examination (in grey in Table 1)
linear ΔT vs x model: y = i + s x


Table 1. Thermal data test series. 





Set #   Point #   Lab     y      U(y)     x       U(x)
1        1        Lab1    -75    0,045    27,5    0,2
         2        Lab1    -60    0,045    27,5    0,2
         4        Lab1    -30    0,045    29,1    0,6
         9        Lab1     20    0,045    45,8    0,6
        10        Lab1     20    0,045    46,2    3,2
        12        Lab1    295    0,045    91,6    0,8
        15        Lab1    370    0,045   103,8    2,0
        16        Lab1    460    0,045   123,0    6,0
        18        Lab1    470    0,045   124,0    6,0
        20        Lab1    635    0,045   154,9    1,6
2        3        Lab2    -60    0,030    27,5    0,2
         6        Lab2    -15    0,030    33,4    0,2
        14        Lab2    350    0,030   101,0    0,3
3        5        Lab3    -30    0,020    29,1    0,6
         7        Lab3     18    0,020    45,8    0,6
         8        Lab3     60    0,020    45,8    0,6
4       11        Lab4     20    0,107    46,2    3,2
        13        Lab4    300    0,070    91,6    0,8
        19        Lab4    440    0,070   124,0    6,0
        21        Lab4    580    0,070   154,9    1,6
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The results of the estimation with different LS methods are reported in Table 2. 
The data of Lab1, which represent about half the set, obviously appear
homogeneous if the outlier is omitted; the omission halves the standard
deviation of the fit residuals and also significantly reduces the bootstrap
confidence interval. The Mathematica© software package gives a slope
value, which is the most important indicator for these data, of s = 5,2.


Table 2. Results on the thermal data test series. 


Columns: OLS | WLS | EW TLS | OLS bootstrap (95%CI) | OLS with fixed effect | WLS with fixed effect

Intercept i
All data
All Labs   -202   -204   -205   -(184-220)   -209   -207
s.d.       23   [illegible]   23   24   22   22
Lab1       -214   -(186-234)
s.d.
No outlier
All Labs   -210   -208   -208   -(189-223)   -207   -206
s.d.       17   17   17   17   17
Lab1       -217   -217   -(198-240)
s.d.       14   [illegible]   14

Slope s
All data
All Labs   5,26   5,01-5,52   5,34   5,32
Lab1       5,39   4,98-5,68
No outlier
All Labs   5,34   5,42   5,42   5,13-5,58   5,43   5,39
Lab1       5,53   5,53   5,36-5,76










The intercept value estimates show no significant difference (range of 15) from
OLS to EW TLS estimation, compared with the bootstrap 95%CI interval (34-48).
However, there is a decrease of about 5% when the outlier is discarded,
comparable to the bootstrap 67% semi-CI (0,08-0,12).

The slope estimates show a range 5,26—5,53, i.e. a variation of 0,27, but a 
definite increase of the slope of about 0,1-0,2 when the outlier is omitted, 
comparable with the bootstrap 67% semi-CI (0,10-0,17). By using the WLS 
almost the same results of the EW TLS are obtained. For WLS, it may be 
significant to apply it also to the reversed variables: in this case, one gets a set of 
new slope values lying in the same range. 

The application of the fixed effect method provides the optimal positions of
each series with respect to one taken (arbitrarily) as the reference, here Lab1.
Only the translation for series Lab4 when the outlier is omitted, Δi = -20, is
statistically significant, being higher than the standard deviation of the fit. The
slope and intercept values lie within the previous ranges.

An overall estimate of the uncertainty has been obtained with a Monte
Carlo simulation approach: 1000 sets of data are randomly drawn from the
bidimensional rectangular pdf defined by the 95%CI of either variable; the
parameters obtained by fitting the simulated datasets are used to infer their
uncertainty. The best results (Fig. 1) give an uncertainty ε_WLS,min(y) = 32,6
occurring at x = 55 ± 3, corresponding to about half the best standard
deviation of the fit.
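
The Monte Carlo procedure described above can be sketched as follows. This is a simplified illustration, not the authors' implementation: it uses an unweighted straight-line fit, treats each quoted uncertainty as the half-width of a rectangular pdf, and the names `ols_fit` and `monte_carlo_line` are hypothetical.

```python
import random
import statistics


def ols_fit(x, y):
    """Ordinary least squares straight line y = i + s*x; returns (i, s)."""
    n = len(x)
    xm, ym = sum(x) / n, sum(y) / n
    sxx = sum((xi - xm) ** 2 for xi in x)
    sxy = sum((xi - xm) * (yi - ym) for xi, yi in zip(x, y))
    s = sxy / sxx
    return ym - s * xm, s


def monte_carlo_line(x, ux, y, uy, trials=1000, seed=0):
    """Refit the line on data perturbed within rectangular pdfs.

    ux and uy are taken as half-widths of the rectangular pdfs
    (an assumption made for this sketch).
    """
    rng = random.Random(seed)
    intercepts, slopes = [], []
    for _ in range(trials):
        xs = [xi + rng.uniform(-u, u) for xi, u in zip(x, ux)]
        ys = [yi + rng.uniform(-u, u) for yi, u in zip(y, uy)]
        i, s = ols_fit(xs, ys)
        intercepts.append(i)
        slopes.append(s)
    return {
        "intercept_mean": statistics.mean(intercepts),
        "intercept_sd": statistics.stdev(intercepts),
        "slope_mean": statistics.mean(slopes),
        "slope_sd": statistics.stdev(slopes),
    }
```

The spread of the simulated intercepts and slopes gives an estimate of the parameter uncertainty that accounts for the uncertainty on both variables of each point.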


Figure 1. 95%CI boundaries of the Monte Carlo simulation using WLS and no outlier. 


3. Comparison of LS Evaluations with Simulated Data 


Table 3 reports the set of simulated data (single series), also based on real
metrological data. Table 4 gives the results obtained with the different methods.


Table 3. Simulated data.

 #      y      U(y)     x         U(x)
 1      1,0    0,28     0,992     0,28
 2      2,0    0,30     2,0076    0,30
 3      3,0    0,18     3,0027    0,18
 4      4,0    0,30     3,986     0,30
 5      5,0    0,25     4,9595    0,25
 6      6,0    0,30     6,126     0,30
 7      7,0    0,12     7,0091    0,12
 8      8,0    0,20     7,988     0,20
 9      9,0    0,18     8,9973    0,18
10     10,0    0,50    10         0,50
11     11,0    0,25    11,1122    0,25
12     12,0    0,28    11,9556    0,28
13     13,0    0,17    13,0455    0,17
14     14,0    0,77    13,9664    0,77
15     15,0    0,24    15,045     0,24
16     16,0    0,25    15,9744    0,25


Table 4. Results on simulated data (fit s.d. = 0,050).

               OLS        WLS        EW TLS
Intercept i    0,00741    0,01550    0,01159
Slope s        1,00036    0,99987    0,99753
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In this case, the WLS is only marginally different from OLS. On the contrary, 
the use of EW TLS, the mathematically accurate method, gives significantly 
different results, as compared with the fit uncertainty: the intercept is about 20 
lower than the OLS and WLS estimates. Monte Carlo simulation estimates are 
significantly better for WLS than for OLS: ε(y)_OLS = 0,42 at x = 7,46 ± 0,21
(Fig. 2). The value ε_WLS,min(y) = 0,33, at x = 7,98 ± 0,16, corresponds to the fit
standard deviation.
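
For orientation, the OLS and WLS straight-line fits can be sketched on the simulated data of Table 3. The sketch rests on two assumptions that are not stated in the text: the column labelled x is fitted as a function of the column labelled y (the orientation that lands close to the OLS entries of Table 4), and the WLS weights are taken as 1/U². The function name `fit_line` is hypothetical, and EW TLS is not implemented here.

```python
def fit_line(x, y, weights=None):
    """Weighted least squares line y = i + s*x; unit weights give OLS."""
    if weights is None:
        weights = [1.0] * len(x)
    sw = sum(weights)
    xm = sum(w * xi for w, xi in zip(weights, x)) / sw
    ym = sum(w * yi for w, yi in zip(weights, y)) / sw
    sxx = sum(w * (xi - xm) ** 2 for w, xi in zip(weights, x))
    sxy = sum(w * (xi - xm) * (yi - ym) for w, xi, yi in zip(weights, x, y))
    slope = sxy / sxx
    return ym - slope * xm, slope


# Values as read from Table 3 (decimal commas written as points)
nominal = [float(k) for k in range(1, 17)]
measured = [0.992, 2.0076, 3.0027, 3.986, 4.9595, 6.126, 7.0091, 7.988,
            8.9973, 10.0, 11.1122, 11.9556, 13.0455, 13.9664, 15.045, 15.9744]
u = [0.28, 0.30, 0.18, 0.30, 0.25, 0.30, 0.12, 0.20,
     0.18, 0.50, 0.25, 0.28, 0.17, 0.77, 0.24, 0.25]

ols_intercept, ols_slope = fit_line(nominal, measured)
wls_intercept, wls_slope = fit_line(nominal, measured,
                                    [1.0 / ui ** 2 for ui in u])
print(ols_intercept, ols_slope)
print(wls_intercept, wls_slope)
```

Neither fit adjusts for the uncertainty of the independent variable, which is the limitation that EW TLS is meant to address.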







Figure 2. 95%CI boundaries of the Monte Carlo simulation using WLS. 


4. Conclusions 


The comparison of different LS methods applied to the case of data with
heteroscedastic errors on both variables has shown that the WLS may
sometimes be sufficient, depending on the type of weighting scheme, with respect
to the EW TLS method, which is more computer-intensive but more
mathematically accurate. Consequently, WLS could sometimes be sufficient,
after a rational comparison and discussion made by performing several
least-squares estimates. A Monte Carlo simulation using the uncertainty CIs of the
data on both variables as pdfs can be a powerful tool in providing an estimate of
the overall uncertainty of the fitting model.
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In roughness measurements often noise plays a role. Noise may give an offset in 
a measurement parameter, as noise makes the parameter deviate away from zero. In this
paper we propose a method to correct for noise bias for the surface parameter Sq. By
considering the decrease in Sq once an average over multiple measurements is made, an
unbiased value for Sq is estimated by extrapolating the value to an infinite number of
measurements. It is shown that using this method for two measurements only, the true
is extended to obtain a complete 'noise-corrected' surface by considering the power 
spectrum and the change of each Fourier component with averaging. Combining the two 
methods and considering the statistical significance of each Fourier component enables a 
further reduction. Examples and simulations are shown for the calibration of roughness 
drive axis and surface measurements. 


1. Method 


1.1. Parameter extrapolation 


The parameter Sq in surface metrology is a measure of power as it is the
rms value of the surface. It is possible to use the mean square, Sq², to extrapolate
the Sq to a power level with reduced noise bias. As the noise is independent of
the measurement, the power of a measurement consists of the power of the object
and the power of noise [1]:


$$Sq^2_{measurement} = Sq^2_{object} + Sq^2_{noise} \qquad (1)$$


If multiple measurements are taken, noise is reduced by the number of
measurements n, but the power of the object remains:


$$Sq^2_{mean(n)} = Sq^2_{object} + \frac{Sq^2_{noise}}{n} \qquad (2)$$

Rearranging 1 and 2 gives the power of the object:

$$Sq^2_{object} = \frac{n \cdot Sq^2_{mean(n)} - Sq^2_{measurement}}{n - 1} \qquad (3)$$

In figure 1, the results are given of the extrapolation for a simulated block
surface with an Sq of 1.381 µm. The measurement surface is 0.01 mm², the period 10
µm and the amplitude 1 µm. Normally distributed noise with a standard deviation of
1 µm is added. As the standard uncertainty in the extrapolated sets does not
decrease, e.g. averaging two profiles gives Sq = 1.547 µm ± 12 nm and the
reduced pairs give Sq = 1.376 µm ± 10 nm, it is possible that the calculated
parameter is lower than that of the ‘clean’ surface, but within the uncertainty
limits. The bias effect is observed in surface plates [2] and noted for rms values of
wavefronts by Davies [3]. Davies uses a combinatorial method to remove the





Figure 1. Extrapolated sets of two measurements compared to the mean of two,
four and eight measurements and the single results (horizontal axis: number of
measurements, from 1 to infinity). The surface from the mean of eight files is
in the upper right (Sq = 1.419 µm), the random phase exclusion surface is lower
right (Sq = 1.374 µm).
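
The extrapolation of equation 3 is easy to express in code. The sketch below is illustrative only: the function name `extrapolate_sq` is hypothetical, and the clamp to zero for a negative extrapolated power mirrors the rule stated for single Fourier components rather than a rule stated for Sq itself.

```python
import math


def extrapolate_sq(sq_single, sq_mean, n):
    """Noise-corrected Sq from equation 3.

    sq_single: Sq of a single measurement (object plus full noise power)
    sq_mean:   Sq of the mean of n measurements (noise power reduced by n)
    n:         number of averaged measurements, n >= 2
    """
    if n < 2:
        raise ValueError("at least two measurements are needed")
    power = (n * sq_mean ** 2 - sq_single ** 2) / (n - 1)
    # A noisy estimate can come out negative; clamp it to zero (assumption)
    return math.sqrt(max(power, 0.0))


# Consistency check with equations 1 and 2: object power 4, noise power 1
sq_single = math.sqrt(4.0 + 1.0)      # equation 1
sq_mean_2 = math.sqrt(4.0 + 1.0 / 2)  # equation 2 with n = 2
print(extrapolate_sq(sq_single, sq_mean_2, 2))
```

With these noise-free inputs the function recovers the object Sq of 2; with real measurements the result scatters around the object value, as the uncertainties quoted above indicate.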





1.2. Fourier reduction 


The next step is to obtain a complete reduced surface. First a Fourier 
transformation is performed. The power, the squared amplitude, of a single 
frequency also consists of an object and a noise part. Following formulas 1 to 3, 
the amplitude spectrum of a noise reduced surface can be calculated. Inverse 
transformation is done with the phase of the mean of measurements. If a noisy 
frequency is encountered it is possible the power is reduced to a negative 
number. In that case the power of the frequency is set to zero. 


1.3. Random phase exclusion 


Even further reduction is possible. The Fourier terms are sorted by
statistical importance. The more random a frequency, the larger the standard
deviation in the mean of phases of that frequency will be and the less significant
it is. Starting with the most significant frequency, Sq is calculated and compared
to the extrapolated Sq_object calculated with equation 3. If the limit is not met the
next frequency is added to the spectrum until Sq_surface meets Sq_object.
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2. Calibration 
A drive axis of a roughness tester is calibrated with a randomizing method
[4]. A reference plane is measured at 6 different positions. The reference and its
phase are thus randomized while the axis remains constant. Using the noise





Figure 2. Calibration of drive axis of roughness tester with random phase exclusion. 
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The main purpose of the paper is to present uncertainty evaluation of Standard 
Platinum Resistance Thermometers (SPRT). The three methods of calculation 
of expanded uncertainty are presented and the results are compared. 


1. The Challenge 


The national standard thermometer used for the range 13.8033 K to 273.16 K is 
calibrated according to International Temperature Scale, ITS-90, at the triple 
point of: Hydrogen, Oxygen, Argon, Neon, Mercury and Water. 

Those temperatures are produced using an appropriate cryostat. 
Instrumentation associated with the measurement laboratory set up consisted of 
fixed-point cells, platinum resistance sensors, alternating current resistance 
bridges, and standard resistors with thermostats and the thermostat for water 
cell. 

The reproduced temperature to be measured is defined by the value of the 
resistance of SPRT at each fixed point (which corresponds to each of the 5 
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points: Hydrogen, Oxygen, Argon, Neon and Mercury), compared to the 
resistance of the SPRT at reference point (triple point of Water). 

The input quantities evaluated as type A are characterized by probability
density functions (pdfs) such as Student's t and Gaussian distributions, and those
evaluated as type B by uniform pdfs. Three methods for calculating the coverage factor [1] at a level of
confidence of 0.95 are provided. Two methods are based on calculation of 
combined standard uncertainty. The first one uses an arbitrary coverage factor 
while the second uses an approximation of a coverage factor. The third method 
is based on calculating the convolution of input quantities. The results of the 
calculations of these three methods are compared for all 5 equilibrium points. In 
a previous paper [3] some preliminary information is given about applying the 
procedures described in here to working types of PRTs. By contrast, the present 
paper focuses upon “national” SPRTs by considering how components 
contribute to the overall uncertainty, and calculation of uncertainties for these 
units. 

The influence of all external quantities was considered as thoroughly as 
possible. It can be noticed that although most of these were non-correlated, 
some of them were. The effects of both the correlated and uncorrelated factors 
were taken into account in the measurement equation. 


2. Equation of measurement 


Measurement equation (1) presents the mathematical model of measurand as a 
quotient of the resistance of the thermometer placed in the medium at any fixed 
point, e.g. hydrogen triple point, and the resistance of the thermometer placed in 
the substance at the stipulated reference point. 


$$W = \frac{R_{Tc}}{R_{0c}} = \frac{R_T + \delta R_{T1} + \delta R_{T2} + \delta R_{T3} + \delta R_{T4} + \delta R_{T5}}{R_0 + \delta R_{01} + \delta R_{02} + \delta R_{03} + \delta R_{04} + \delta R_{05} + \delta R_{06}} \qquad (1)$$


R_Tc - resistance at any fixed point with corrections

R_0c - resistance at the reference point with corrections

R_T - resistance at any fixed point without corrections

R_0 - resistance at the reference point without corrections

δR_T1 - correction of any fixed point due to chemical impurity

δR_T2 - correction of any fixed point due to hydrostatic pressure

δR_T3 - correction of any fixed point due to perturbing heat exchanges

δR_T4 - correction of any fixed point due to self-heating

δR_T5 - correction of any fixed point due to AC/DC measurements

δR_01 - correction of the reference point due to chemical impurity of water

δR_02 - correction of the reference point due to hydrostatic pressure

δR_03 - correction of the reference point due to perturbing heat exchanges

δR_04 - correction of the reference point due to self-heating

δR_05 - correction of the reference point due to AC/DC measurements

δR_06 - correction of the reference point due to SPRT internal insulation leakage.


The resistance of the thermometer at a fixed point is given by the equation

$$R_T = X_{Tc} \cdot R_{STc} = (X_T + \delta X_T)(R_S + \delta R_{S1} + \delta R_{S2}) \qquad (2)$$


X_Tc - reading of the bridge at the temperature of any fixed point after
correction

R_STc - resistance of the reference standard resistor after correction

X_T - reading of the bridge at the temperature of any fixed point before
correction

R_S - resistance of the reference standard resistor before correction
(R_S = 25.000289 Ω)

δX_T - “noise” of the bridge caused by random effects

δR_S1 - correction of reference standard instability (time drift)

δR_S2 - temperature correction of reference standard.


Input quantity connected with the resistance at the triple point of water:

$$R_0 = X_{0c} \cdot R_{S0c} = (X_0 + \delta X_0)(R_S + \delta R_{S3} + \delta R_{S4}) \qquad (3)$$


X_0c - reading of the bridge at the reference point after correction

R_S0c - resistance of the reference standard resistor after correction

X_0 - reading of the bridge at the reference point before correction
(X_0 = 1.0208720)

R_S - resistance of the reference standard resistor before correction
(R_S = 25.000289 Ω)

δX_0 - “noise” of the bridge caused by random effects

δR_S3 - correction of reference standard instability (time drift)

δR_S4 - temperature correction of reference standard resistor.

For the purpose of the uncertainty calculation all corrections are chosen to be
equal to zero.


3. Combined uncertainty 


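Based on the measurement equation, the formula for the combined uncertainty is
given by Eq. 4.

$$u_c^2(W) = c_1^2\left[u^2(X_T) + u^2(\delta X_T)\right] + c_2^2\left[u^2(X_0) + u^2(\delta X_0)\right] + c_3^2\left[u^2(R_S) + u^2(\delta R_{S1}) + u^2(\delta R_{S2})\right] + c_4^2\left[u^2(R_S) + u^2(\delta R_{S3}) + u^2(\delta R_{S4})\right] + 2 c_3 c_4\, u^2(R_S)\, r(R_S, R_S) + c_5^2 \sum_{i=1}^{5} u^2(\delta R_{Ti}) + c_6^2 \sum_{i=1}^{6} u^2(\delta R_{0i}) \qquad (4)$$

where:

$$c_1 = \frac{\partial W}{\partial X_{Tc}} = \frac{1}{X_0}, \qquad c_2 = \frac{\partial W}{\partial X_{0c}} = -\frac{X_T}{X_0^2}$$

$$c_3 = \frac{\partial W}{\partial R_{STc}} = \frac{X_T}{X_0 R_S}, \qquad c_4 = \frac{\partial W}{\partial R_{S0c}} = -\frac{X_T}{X_0 R_S}$$

$$c_5 = \frac{\partial W}{\partial R_{Tc}} = \frac{1}{X_0 R_S}, \qquad c_6 = \frac{\partial W}{\partial R_{0c}} = -\frac{X_T}{X_0^2 R_S}$$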

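Taking into account that the correlation coefficient r(R_S, R_S) is equal to 1 and
c_4 = -c_3, we obtain the simplified Eq. 5

$$u_c^2(W) = c_1^2\left[u^2(X_T) + u^2(\delta X_T)\right] + c_2^2\left[u^2(X_0) + u^2(\delta X_0)\right] + c_3^2\left[u^2(\delta R_{S1}) + u^2(\delta R_{S2})\right] + c_4^2\left[u^2(\delta R_{S3}) + u^2(\delta R_{S4})\right] + c_5^2 \sum_{i=1}^{5} u^2(\delta R_{Ti}) + c_6^2 \sum_{i=1}^{6} u^2(\delta R_{0i}) \qquad (5)$$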
where: 
u(X_T) - uncertainty of the F18 Bridge as stated in technical documentation

u(δX_T) - standard uncertainty calculated based on series of 10 measurements

u(X_0) - uncertainty of the F18 Bridge as stated in technical documentation

u(δX_0) - standard uncertainty calculated based on series of 20 measurements

u(δR_S1) - standard uncertainty of standard resistor, caused by a drift in time (calculated
as a half year instability)

u(δR_S2) - standard uncertainty of standard resistor due to temperature variation
(calculated based on technical data of F-18 Bridge for ±5 K ambient temperature
variation and thermostat stabilization coefficient equal 30)

u(δR_S3) - standard uncertainty of standard resistor, caused by a drift in time (calculated
as a one year instability)

u(δR_S4) - standard uncertainty of standard resistor due to temperature variation
(calculated based on technical data of F-18 Bridge for ±5 K ambient temperature
variation and thermostat stabilization coefficient equal 30)

u(δR_T1) - standard uncertainty of achieving the equilibrium state at the triple point of
hydrogen = 0.17 mK (value given by CCT/01-02 document) recalculated for
uncertainty of thermometer resistance

u(δR_T2) - standard uncertainty of hydrogen pressure measurement (due to imperfection
of measurement of pressure)

u(δR_T3) - standard uncertainty due to non-adiabatic conditions in cryostat based on
series of 3 measurements

u(δR_T4) - self-heating of resistance thermometer based on series of 10 measurements

u(δR_T5) - standard uncertainty due to AC bridge excitation

u(δR_01) - standard uncertainty of achieving the equilibrium state at the triple point of
water 0.1 mK (value given by CCT/01-02 document), calculation performed with the
use of dR/dT = 0.1 Ω/K, resulting in 1E-5 Ω

u(δR_02) - standard uncertainty of pressure measurement

u(δR_03) - standard uncertainty due to temperature gradient inside the cell (evaluated
experimentally, by several trials of thermometer immersion, based on series of
5 measurements)

u(δR_04) - self-heating of resistance thermometer (current was 2 mA), based on series of
10 measurements (value estimated experimentally)

u(δR_05) - standard uncertainty due to AC bridge excitation

u(δR_06) - standard uncertainty due to wiring leakage.


4. Budget of uncertainties 


The summarised uncertainties and their values, applied distribution functions 
plus sensitivity coefficients and contributions of each component are presented 
in Table 1. 


Tab. 1. Summarized uncertainties and their probability distribution functions (for triple
point of hydrogen and triple point of water)


Quantity | Estimate  | Unit | Standard uncertainty | Unit | Distribution | Deg. of freedom | Sensitivity coefficient | Unit | Contribution
X_T      | 0.0011901 |      | 1.16E-07    |   | rectang.    |    | 0.9795547  |     | 1.131E-07
δX_T     | 0         |      | [illegible] |   | [illegible] | [illegible] | 0.9795547 |  | [illegible]
X_0      | 1.020872  |      | 1.16E-07    |   | rectang.    |    | -0.0011419 |     | -1.319E-10
δX_0     | 0         |      | 2.16E-09    |   | t-Student   | 19 | -0.0011419 |     | -2.46E-12
δR_S1    | 0         | Ω    | 1.44E-05    | Ω | rectang.    |    | 4.663E-05  | 1/Ω | 6.730E-10
δR_S2    | 0         | Ω    | 1.64E-07    | Ω | rectang.    |    | 4.663E-05  | 1/Ω | 7.628E-12
δR_S3    | 0         | Ω    | 2.89E-05    | Ω | rectang.    |    | -4.663E-05 | 1/Ω | -1.346E-09
δR_S4    | 0         | Ω    | 1.64E-07    | Ω | rectang.    |    | -4.663E-05 | 1/Ω | -7.628E-12
δR_T1    | 0         | Ω    | 1.09E-06    | Ω | normal      |    | 0.03918174 | 1/Ω | 4.263E-08
δR_T2    | 0         | Ω    | 4.62E-08    | Ω | rectang.    |    | 0.03918174 | 1/Ω | 1.809E-09
δR_T3    | 0         | Ω    | 3.36E-09    | Ω | t-Student   | 2  | 0.03918174 | 1/Ω | 1.316E-10
δR_T4    | 0         | Ω    | 0.00E+00    | Ω | t-Student   | 9  | 0.03918174 | 1/Ω | 0
δR_T5    | 0         | Ω    | 2.17E-06    | Ω | rectang.    |    | 0.03918174 | 1/Ω | 8.483E-08
δR_01    | 0         | Ω    | 1.00E-05    | Ω | normal      |    | -4.568E-05 | 1/Ω | -4.568E-10
δR_02    | 0         | Ω    | 4.04E-07    | Ω | rectang.    |    | -4.568E-05 | 1/Ω | -1.846E-11
δR_03    | 0         | Ω    | 5.00E-06    | Ω | t-Student   | 4  | -4.568E-05 | 1/Ω | -2.283E-10
δR_04    | 0         | Ω    | 1.10E-06    | Ω | t-Student   | 9  | -4.568E-05 | 1/Ω | -5.024E-11
δR_05    | 0         | Ω    | 2.89E-06    | Ω | rectang.    |    | -4.568E-05 | 1/Ω | -1.318E-10
δR_06    | 0         | Ω    | 7.22E-07    | Ω | rectang.    |    | -4.568E-05 | 1/Ω | -3.296E-11
W        | 0.0011657 |      |             |   |             |    |            |     | 1.473E-07
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5. Calculation of expanded uncertainty 


5.1. Conventional method 
For the conventional method the coverage factor is assumed to be k = 2 for a
confidence level p = 0.95.


5.2. Approximate method . 
The following value of coverage factor is assumed, depending on ratio, which in 
this case was defined as: 

lu, (y ) 


r ee a 


where: u, — geometric sum of all contributing components 
u; — the largest contributing component of type B. 


Three cases are considered:


a) k = k_N for 0 < r < 1 (normal distribution), k_N = 1.96 (p = 95%)


b) k=kr for 1< r, < 10 (trapezoidal distribution) 
where: k shr 2a) 
r +i 


c) k = k_P for r > 10 (uniform distribution), k_P = √3 p.


Details are described in the article titled: “A method of approximation of the 
coverage factor in calibration” by P. Fotowicz, accepted for publication in the 
journal Measurement 11 December 2003 


5.3 Convolution method 


A block diagram of the expanded uncertainty calculation based on the convolution
method is presented in another paper in this book. In this case all contributing
components are taken into account and no k-factor is needed to calculate the
expanded uncertainty. In this type of analysis the term expanded
uncertainty should be replaced by the term coverage interval at level of
confidence p.

Example of one calculation of coverage interval based on convolution 
method in the form of graphical representation of convolutions of probability 
density functions of contributing components, and final convolution and its 
density function are presented in Figure 1. 
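
The idea of the convolution method can be illustrated with a small discrete sketch. It is not the authors' procedure: it handles only symmetric rectangular components on a common grid, and the names `rect_pmf`, `convolve` and `coverage_half_width` are hypothetical.

```python
def rect_pmf(half_width, step):
    """Discrete rectangular distribution on a grid from -half_width to +half_width."""
    n = int(round(half_width / step))
    size = 2 * n + 1
    return [1.0 / size] * size


def convolve(p, q):
    """Distribution of the sum of two independent discretized quantities."""
    out = [0.0] * (len(p) + len(q) - 1)
    for i, pi in enumerate(p):
        for j, qj in enumerate(q):
            out[i + j] += pi * qj
    return out


def coverage_half_width(pmf, step, p=0.95):
    """Half-width of the symmetric interval containing probability p."""
    mid = len(pmf) // 2
    total = pmf[mid]
    k = 0
    while total < p and k < mid:
        k += 1
        total += pmf[mid - k] + pmf[mid + k]
    return k * step


# Two rectangular components of half-width 1 (arbitrary units), grid step 0.01
step = 0.01
combined = convolve(rect_pmf(1.0, step), rect_pmf(1.0, step))
print(coverage_half_width(combined, step, p=0.95))
```

For two equal rectangular components the result is a triangular distribution, whose 95% half-width is 2(1 - √0.05) ≈ 1.55 times the component half-width; the coverage interval is read directly from the convolved distribution, without any coverage factor.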
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Figure 1. Example of screen intermediate calculations of coverage interval: convolutions 
of probability distribution functions, and cumulative distribution. 


6. Comparison of results 


Fixed point |             | Method I        | Method II       | Method III
Hydrogen    | 0.001165768 | 2.00 | 2.96E-07 | 1.90 | 2.80E-07 | 1.90 | 2.80E-07
            |             |      | 1.92E-06 | 1.96 | 1.87E-06 | 1.96 | 1.87E-06
            |             | 3.02E-06, 2.96E-06 (remaining cells of this row illegible)
Neon        | 0.008311424 | 2.00 | 5.19E-07 | 1.96 | 5.08E-07 | 1.99 | 5.17E-07






7. Conclusion 


Although the results of the calculations performed by method II and method III
are very similar, the advantage of method III is clear: no subjective and rather
sophisticated deliberations are needed.
The main purpose of the paper is to show how easily and clearly a measurement
result can be accompanied by a coverage interval calculated at any desired
confidence level in a Virtual Instrument, provided that an appropriate procedure
is implemented in the instrument, or that the instrument is equipped with an
additional processor which handles the measured data as a series of values and
calculates the coverage interval that covers the true value at a given level of
confidence.


1. The challenge 


The new way of data presentation in virtual or any other instruments proposed 
here is: each displayed result of measurements is accompanied by a parameter 
characterizing its dispersion in a way attributable to the measured value with a 
certain degree of confidence [2, 4]. The desired confidence level is declared by 
a person performing the measurements. 

Such a presentation can be implemented directly in virtual instruments as 
only a small amount of software, presented here, needs to be added to the 
overall software. A specialized processor can play the role of this software. 

This method fully describes the coverage region, which covers the true 
measured value. 


2. The method of uncertainty calculation 


Most contemporary measurement instruments are digital, so their structure
can be represented by a common block diagram. The PC station presented
in Fig. 1 is typical of a virtual instrument. In other digital instruments there are
elements which play a similar role from a functional point of view.


* Work partially funded under EU SofTools_MetroNet Contract N° G6RT-CT-2001-05061.
[Block diagram; recoverable label: LabView-based uncertainty calculation subVI]





Figure 1. Block diagram of a virtual instrument equipped with Multimeter PCI card. 


The algorithm of the developed software is based on the idea presented in the
block diagram depicted in Fig. 2.


[Block diagram; recoverable labels: uniform component, f(error), convolution, FFT]
Figure 2. Block diagram of the developed software for calculating the coverage interval
at confidence level p. FFT: Fast Fourier Transform; Π: product; IFFT: Inverse Fast
Fourier Transform; c_i: sensitivity factors; u_A: uncertainties of type A;
u_B: uncertainties of type B.


The parameters of any A/D input card, sensor, transducer or converter stated
in the technical specification are the input data for the software. The desired
level of confidence for the coverage interval is declared by the experimenter
performing the measurements. The experimenter also declares the number of
individual sample measurements, i.e. the number of repeated readings used
internally to calculate the mean and the type A uncertainties. The parameters
and the type of the probability distribution functions describing the digital
and analogue accuracy of the instrument components must be supplied by the
hardware producer. The repeated individual readings are assumed to be randomly
distributed according to Student's t-distribution.
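
The FFT, product and inverse FFT chain of Fig. 2 can be illustrated with a minimal sketch. It is not the LabView subVI described in the paper: the function names, the common sampling grid centred on zero, and the assumption that each component has already been multiplied by its sensitivity factor are illustrative choices only. Two identical rectangular components are used because their convolution is a triangular density whose 95 % interval is known in closed form.

```python
import cmath
import math


def fft(values, inverse=False):
    """Recursive radix-2 FFT; len(values) must be a power of two."""
    n = len(values)
    if n == 1:
        return list(values)
    even = fft(values[0::2], inverse)
    odd = fft(values[1::2], inverse)
    sign = 1.0 if inverse else -1.0
    out = [0j] * n
    for k in range(n // 2):
        twiddle = cmath.exp(sign * 2j * math.pi * k / n) * odd[k]
        out[k] = even[k] + twiddle
        out[k + n // 2] = even[k] - twiddle
    return out


def convolve_pdfs(pdfs, step):
    """Convolve sampled densities: FFT of each one, product, inverse FFT."""
    full_length = sum(len(p) for p in pdfs) - len(pdfs) + 1
    size = 1
    while size < full_length:
        size *= 2
    product = [1 + 0j] * size
    for pdf in pdfs:
        spectrum = fft(list(pdf) + [0.0] * (size - len(pdf)))
        product = [a * b for a, b in zip(product, spectrum)]
    raw = fft(product, inverse=True)
    # Divide by `size` to finish the inverse transform; each of the
    # len(pdfs) - 1 convolution integrals contributes one factor `step`.
    scale = step ** (len(pdfs) - 1) / size
    return [max(v.real * scale, 0.0) for v in raw[:full_length]]


def coverage_interval(density, step, p):
    """Probabilistically symmetric interval of a density centred on zero."""
    centre = (len(density) - 1) / 2
    area = sum(density) * step
    cumulative, low, high = 0.0, None, None
    for j, d in enumerate(density):
        cumulative += d * step / area
        x = (j - centre) * step
        if low is None and cumulative >= (1 - p) / 2:
            low = x
        if high is None and cumulative >= (1 + p) / 2:
            high = x
    return low, high


step = 0.01
uniform = [0.5] * 201                 # rectangular density on [-1, 1]
density = convolve_pdfs([uniform, uniform], step)
low, high = coverage_interval(density, step, 0.95)
print(low, high)
```

For the triangular result the exact 95 % half-width is 2(1 - sqrt(0.05)), about 1.553, which the sampled interval reproduces to within the grid step. A Student's t density for the type A component could be added to the list passed to `convolve_pdfs` in the same way, provided it is sampled on the same grid.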


3. The results 


The three examples are images of the user panel of the virtual instrument
developed using LabView (NI license).










[Screen image; recoverable panel labels: Input information, Probability Density Function]


Figure 3. Human-Machine Interface of Virtual Voltmeter with 30 declared observations 
and results of type A and type B uncertainties calculation. 
Figure 4. Human-Machine Interface of Virtual Voltmeter with 300 declared observations
and results of type A and type B uncertainties calculation.


4. Conclusions 


The new way of data presentation leaves the experimenter in no doubt about
where the true value is located.

With today’s technology there is no longer any reason to avoid quoting the
uncertainty of measurement results on an LCD display.

All digital instruments can be equipped with a programme or specialized 
processor, which calculates the uncertainty at any required level of confidence. 
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Figure 5. Human-Machine Interface of Virtual Voltmeter with 1000 declared 
observations and results of type A and type B uncertainties calculation. 
A HYBRID METHOD FOR $\ell_1$ APPROXIMATION*


D. LEI, J. C. MASON 
School of Computing and Engineering, 
University of Huddersfield, Huddersfield, HD1 3DH, UK
Email: d.lei@hud.ac.uk 
j.c.mason@hud.ac.uk 


Linear programming (LP) techniques and interior-point methods (IPMs) have been
used to solve $\ell_1$ approximation problems. The advantage of the IPMs is that
they can reach the vicinity of the optimum very quickly regardless of the size of
the problems, but numerical difficulties arise when the current solution is
approaching the optimum. On the other hand, the number of iterations needed for an
$\ell_1$ approximation problem by LP techniques is proportional to the dimension
of the problem. However these LP methods are finite algorithms, and do not have
the problems that the IPMs endure. It would be beneficial to combine the merits of
both methods to achieve computational efficiency and accuracy. In this paper, we
propose an algorithm which applies the IPMs to get a near best solution and
fine-tunes the results by one of the simplex methods. The savings in terms of
numbers of iterations can be substantial.


1 Introduction 


A linear discrete $\ell_1$ approximation problem can be stated as: given a set of
discrete data points $\{(x_i, y_i)\}_{i=1}^{m}$, determine a set of parameters
$c \in \mathbb{R}^n$ which minimizes the following term


$$\min_{c} \|y - Ac\|_1 = \sum_{i=1}^{m} \left| y_i - \sum_{j=1}^{n} a_{ij} c_j \right| = \sum_{i=1}^{m} |r_i|$$


where $A$ is the observation matrix, $A \in \mathbb{R}^{m \times n}$, $m > n$ and
$r_i = y_i - \sum_{j=1}^{n} a_{ij} c_j$.
The $\ell_1$ approximation problem has been successfully solved by linear
programming techniques; the Barrodale and Roberts method¹ is the usual
algorithm of preference. However, it has long been known that the number of
iterations required by a simplex-based method tends to be linear in the problem
size, and in the worst case, it could grow exponentially with the problem.
Several authors have attempted to solve the $\ell_1$ approximation problem by
interior-point methods. For example, Ruzinsky and Olsen proposed a simple
iterative algorithm for the $\ell_1$ approximation problem² based on a variant of Karmarkar’s


* Work partially funded under EU Project SofTools_MetroNet Contract N. G6RT-CT-
linear programming algorithm³. One of the salient features of the IPM-based
approach is that the objective function decreases rapidly in the first few
iterations, but the convergence process slows down when the current solution is
approaching the optimum. Solving an $\ell_1$ approximation problem by IPMs could
be computationally expensive when exact solutions are required. In this paper,
we introduce a hybrid method for $\ell_1$ approximation. It is a combination
of the central and boundary search schemes, which starts with an interior-point
method and switches to a simplex-based approach at an appropriate
point. The effectiveness of the proposed method is shown by various
experimental results.


2 The hybrid method 


The hybrid method starts with an interior-point method, Ruzinsky and
Olsen’s algorithm². It considers the $\ell_1$ fitting problem in the dual form


$$\begin{aligned} \text{maximize} \quad & u^T y - v^T y \\ \text{subject to} \quad & A^T u - A^T v = 0 \\ & u + v = e \end{aligned} \qquad (1)$$


where $e$ is the vector with all entries equal to 1.

The major computational work in each iteration of an IPM involves solving
a weighted linear least squares approximation problem. Ruzinsky and
Olsen investigated the characterization of an $\ell_1$ solution and proposed the
following weighting scheme:


Us Us 
i= TEE and W = diag(w). (2) 


2,3,2 
i “i 


where W is the weighting matrix. 
The algorithm terminates when 
tee (3) 
doi lri] 


where ¢; = 7447; and € is a predetermined tolerance value. 
The authors claimed that such a stopping rule is well behaved, especially in
the presence of roundoff errors. In our numerical experiments, the computation
terminates when (3) is satisfied for some predetermined $\epsilon$. One important
observation, however, is that even if Ruzinsky and Olsen’s algorithm
converges to the required accuracy, this does not imply that the actual solution
is within the stated accuracy.

Having obtained a near best solution, we apply the Barrodale and Roberts
algorithm¹ to associate this interior point with a vertex. We assume that readers
are familiar with the Barrodale and Roberts algorithm. The hybrid
method can be outlined as follows:


1. Set a rough accuracy tolerance value of $\epsilon = 0.1$ and apply Ruzinsky and
Olsen’s algorithm to get a near best solution $u$ and $v$.


2. Compute the corresponding coefficient parameters $c$ by solving the
weighted least squares problem $WAc = Wy$, where $W$ is constructed
by (2). Obtain the residual vector by $r = y - Ac$.


3. Fit the residuals $r$ by the Barrodale and Roberts algorithm to obtain a
correcting approximating function with correcting coefficients $\delta c$.


4. Compute the optimal solution $c^* = c + \delta c$.

The mathematical justification of the hybrid method can be derived as


$$\|y - Ac^*\|_1 = \|y - A(c + \delta c)\|_1 = \|r - A\,\delta c\|_1.$$


Numerical experiments have been carried out to test the effectiveness of
the proposed method. Some of the results are rather striking: the number of
iterations required by the Barrodale and Roberts algorithm decreases
dramatically when the results from the IPMs are used.
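
The two-stage idea (rough solution, then an $\ell_1$ fit of the residuals, then adding the correction) can be sketched for a straight-line model. The sketch below does not implement either algorithm named in the paper: an ordinary least squares line stands in for the near best interior-point solution, and a brute-force search over lines through pairs of data points stands in for the Barrodale and Roberts simplex method. All function names and data are hypothetical.

```python
from itertools import combinations


def line_residuals(xs, ys, c):
    """Residuals r_i = y_i - (c0 + c1 * x_i)."""
    return [y - (c[0] + c[1] * x) for x, y in zip(xs, ys)]


def l1_cost(xs, ys, c):
    """l1 norm of the residuals of the line c."""
    return sum(abs(r) for r in line_residuals(xs, ys, c))


def l1_line_fit(xs, ys):
    """l1 line fit by testing every line through two data points.

    For a two-parameter model a best l1 fit exists that interpolates
    two of the data points, so this search is exact (but slow).
    """
    best, best_cost = None, float("inf")
    for i, j in combinations(range(len(xs)), 2):
        if xs[i] == xs[j]:
            continue
        slope = (ys[j] - ys[i]) / (xs[j] - xs[i])
        c = (ys[i] - slope * xs[i], slope)
        cost = l1_cost(xs, ys, c)
        if cost < best_cost:
            best, best_cost = c, cost
    return best


def least_squares_line(xs, ys):
    """Ordinary least squares line, used here as the rough first stage."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return (my - slope * mx, slope)


def hybrid_fit(xs, ys):
    c = least_squares_line(xs, ys)        # stand-in for steps 1 and 2
    r = line_residuals(xs, ys, c)         # r = y - A c
    dc = l1_line_fit(xs, r)               # step 3: fit the residuals
    return (c[0] + dc[0], c[1] + dc[1])   # step 4: c* = c + dc


xs = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0]
ys = [0.1, 1.0, 2.1, 2.9, 10.0, 5.1]      # one gross outlier at x = 4
c_star = hybrid_fit(xs, ys)
c_direct = l1_line_fit(xs, ys)
print(l1_cost(xs, ys, c_star), l1_cost(xs, ys, c_direct))
```

Both costs agree, which is the identity $\|y - A(c + \delta c)\|_1 = \|r - A\,\delta c\|_1$ in numerical form. The saving reported in the paper comes from the simplex stage needing fewer iterations when started from a good interior point, which this brute-force stand-in does not model.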


3 Conclusion 


The proposed hybrid method incorporates the IPMs and LP techniques,
building a bridge between these two important optimization techniques.
This pilot study indicates the possibility of solving large $\ell_1$
approximation problems by taking advantage of both methods.
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INTERPOLATION EQUATIONS FOR INDUSTRIAL PLATINUM 


RESISTANCE THERMOMETERS. 


P. MARCARINO, P.P.M. STEUR AND A. MERLONE 
CNR Istituto di Metrologia “Gustavo Colonnetti”, Torino, Italy 


The Callendar-Van Dusen (CVD) equations of the IEC-751 for the Industrial Platinum
Resistance Thermometers (IPRTs) do not exactly follow the International Temperature
Scale of 1990 (ITS-90) within the required accuracy. Indeed, there is a demand for
calibrations of IPRTs to uncertainties of around 10 mK for temperatures from 0 to
250 °C and better than 0.1 K between 250 °C and about 500 °C, while the use of the CVD
equations does not allow an interpolation uncertainty better than ± 0.2 K over a range
larger than 0 - 250 °C. To solve this problem, two new reference equations, one below
0 °C and the other above 0 °C, are proposed to be used instead of the CVD equations as
reference for the IPRTs. These equations will be of a higher order than the CVD
equations and their lower order coefficients are equal to the constants A, B and C of the
CVD. The use of these new reference equations allows an interpolation accuracy at the
millikelvin level, limiting the number of calibration points in every range to five.


1. Introduction


Before 1968, the Callendar-Van Dusen (CVD) equations

for t < 0 °C

$$\frac{R_t}{R_0} = 1 + At + Bt^2 + C\,(t - 100\,^{\circ}\mathrm{C})\,t^3 \qquad (1)$$

for t > 0 °C

$$\frac{R_t}{R_0} = 1 + At + Bt^2 \qquad (2)$$

described the resistance-temperature relation of both standard and industrial
platinum resistance thermometers, indicated respectively with SPRTs and IPRTs
[1,2]. Since 1968, the CVD equations were used only for the IPRTs and their
agreement with the SPRTs realizing the temperature scale was to be verified
[3, 4, 5, 6].
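
As a small illustration, the CVD resistance ratio can be evaluated as below. The coefficient values are the nominal IEC 751 constants for industrial platinum thermometers; the function name and the use of plain floating-point arithmetic are illustrative assumptions, not part of the paper.

```python
# Nominal IEC 751 constants (t in degrees Celsius).
A = 3.9083e-3      # 1/°C
B = -5.775e-7      # 1/°C^2
C = -4.183e-12     # 1/°C^4, used only below 0 °C


def cvd_ratio(t, a=A, b=B, c=C):
    """Resistance ratio R_t / R_0 from the Callendar-Van Dusen equations."""
    ratio = 1.0 + a * t + b * t * t
    if t < 0.0:
        ratio += c * (t - 100.0) * t ** 3
    return ratio


for t in (-200.0, 0.0, 100.0):
    # Multiply by R_0 = 100 ohm to obtain the resistance of a Pt100 sensor.
    print(t, 100.0 * cvd_ratio(t))
```

The ratio at 100 °C is 1.385055, i.e. about 138.51 Ω for a 100 Ω sensor.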


From 1990, with the introduction of the ITS-90 [7], temperature $t_{90}$ was
defined by means of a reference function, a logarithmic polynomial equation of
12th order below 0 °C and of 9th order above 0 °C, and a quadratic linear
deviation function, the constants of the latter being determined by calibration at
the fixed points. The differences between the ITS-90 and the CVD equations are
shown in Figure 1.


[Plot: Callendar-Van Dusen - ITS-90; vertical axis: Difference / °C]


Figure 1. Differences between the CVD equations and the ITS-90 in the temperature 
range from —200 °C to 850 °C. 

Many laboratories are using platinum resistance thermometers that do not 
satisfy the requirements of the ITS-90 as working standard for temperature 
calibrations. The working standard thermometer needs to be inexpensive and to 
exhibit very good stability, but only in the operating range of its bath or furnace. 
Thermometer calibrations can be carried out either directly by comparison with 
the standard reference thermometer (for best accuracy) or, as for routine 
calibrations, by comparison with the working standard only. 

The calibration accuracy depends on the stability of the working standard 
between subsequent calibrations, and also on the capability of the interpolating 
equation to approximate the ITS-90. As is evident from Figure 1, the CVD 
equations are completely inadequate at the level of 0.01 °C. 

In the meantime, calibration service laboratories have improved their
capabilities considerably and some have been recognized at the level of 0.03 °C
in a limited range, and it is to be expected that their capabilities will continue to
improve. Evidently an alternative, adequate interpolation equation is required.


2. The proposed reference functions 


The ITS-90 reference function has been fitted with two functions, one for
t < 0 °C, between -200 °C and 0 °C, and one for t > 0 °C, between 0 °C and
850 °C, maintaining for the lower orders the form of the CVD equations. They
are of the form:

for t < 0 °C

$$\frac{R_t}{R_0} = 1 + At + Bt^2 + C\,(t - 100\,^{\circ}\mathrm{C})\,t^3 + \sum_{i=5}^{12} D_i t^i \qquad (3)$$

for t > 0 °C

$$\frac{R_t}{R_0} = 1 + At + Bt^2 + \sum_{i=3}^{11} C_i t^i \qquad (4)$$

where the coefficients A, B and C are those of the CVD equations and the
coefficients $D_i$ and $C_i$ have been calculated in order to obtain fits with
residuals less than 1 mK over all of the range. All coefficient values are
reported in Table I.


Table I The coefficients for the two proposed reference functions, eqs. (3) and (4). 


B | -5.775E-7 | B | -5.775E-7
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3.21180E-22 |Cə |  -1.24395E-25 


7.21836E-25 6.03317E-29 
6.94960E-28 -1.26471E- 
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3. Application of the reference functions to actual calibration data 


The reference functions obtained above can be fitted to the IPRT calibration
data in two ways: (a) by calculating the A, B and C coefficient values of eqs. (3)
and (4), where all coefficients $D_i$ and $C_i$ are constants, in the same way as
for the CVD equations; or (b) by using a 2nd order deviation function with respect
to eqs. (3) and (4) as reference functions, in the same way as for thermocouples.
The described interpolating methods have been applied to the same 
calibration data from SIT laboratories (Italian Calibrations Services) as used in 
Figure 2. The residuals of the straightforward application of the CVD equations to the calibration
data of 38 working standard thermometers of SIT laboratories.


equations, while Figure 3 shows the residuals of the application of the new 
reference functions. 


The improvement in residuals from Figure 2 to Figure 3 is more than 
evident, while the remaining scatter mirrors the calibration capabilities of the 
single laboratories. Therefore the adoption of the reference functions proposed 
in this work satisfies the increasing demand for calibration accuracy of IPRTs to 
the Calibration Services and will leave ample space for improvement in 
measurement capability. 
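
Fitting option (a) above 0 °C reduces to a linear least squares problem in A and B once the fixed higher-order terms are moved to the left-hand side. The following is a minimal sketch under that reading; the function name, the synthetic calibration points and the single higher-order coefficient are hypothetical and are not the Table I values.

```python
def fit_a_b(points, higher_order):
    """Least squares A and B for t > 0 with fixed higher-order terms.

    points: list of (t, w) pairs with w = R_t / R_0 and t in °C.
    higher_order: dict {i: C_i} of fixed coefficients of t**i.
    """
    s2 = s3 = s4 = b1 = b2 = 0.0
    for t, w in points:
        # Move the known terms to the left-hand side: z = A t + B t^2.
        z = w - 1.0 - sum(ci * t ** i for i, ci in higher_order.items())
        s2 += t ** 2
        s3 += t ** 3
        s4 += t ** 4
        b1 += t * z
        b2 += t ** 2 * z
    # Solve the 2 x 2 normal equations [[s2, s3], [s3, s4]] [A, B] = [b1, b2].
    det = s2 * s4 - s3 * s3
    a = (s4 * b1 - s3 * b2) / det
    b = (s2 * b2 - s3 * b1) / det
    return a, b


# Synthetic example: five calibration points generated from known constants.
true_a, true_b = 3.9083e-3, -5.775e-7
fixed = {3: 1.0e-12}                     # hypothetical, for illustration only
temps = [100.0, 200.0, 300.0, 400.0, 500.0]
data = [(t, 1.0 + true_a * t + true_b * t ** 2 + fixed[3] * t ** 3) for t in temps]
print(fit_a_b(data, fixed))
```

With five points and two unknowns the fit is overdetermined, so the residuals also indicate the quality of the calibration data.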


4. Conclusions 


Two new reference functions, one below 0 °C and the other above 0 °C, are
proposed as substitutes for the CVD equations as standard functions for the
IPRTs. The new reference functions have the following advantages:

1) they maintain for the lower order coefficients the same constants A, B and C
of the CVD;

2) the higher order coefficients are constant for all IPRTs and allow an
agreement with the ITS-90 well within 1 mK;

3) only the constants A, B and C are to be calculated from the calibration
points, which can then be limited to five over all ranges.




References 


1. Comptes Rendus, Onzième Conf. Gén. des Poids et Mesures (Paris, 1960);
   J. Research Natl. Bur. Standards, 1961, 65A, 139.

2. IEC, Industrial Platinum Resistance Thermometer Sensors, IEC Standard
   Publication 751 (Bureau Central de la Commission Electrotechnique
   Internationale, Geneva), Amendment 2, 1995.

3. Connolly J.J., In Temperature: Its Measurement and Control in Science and
   Industry, Vol. 5 (Edited by J.F. Schooley), New York, American Institute
   of Physics, 1982, 815-817.

4. Actis A. and Crovini L., In Temperature: Its Measurement and Control in
   Science and Industry, Vol. 5 (Edited by J.F. Schooley), New York,
   American Institute of Physics, 1982, 819-827.

5. Crovini L., Actis A., Coggiola G., Mangano A., In Temperature: Its
   Measurement and Control in Science and Industry, Vol. 6 (Edited by J.F.
   Schooley), New York, American Institute of Physics, 1992, 1077-1082.

6. P. Marcarino, P.P.M. Steur, G. Bongiovanni, B. Cavigioli, 2002: “ITS-90
   Approximation by means of non-standard platinum resistance
   thermometers”, Proc. TempMeko2001 (ed. B. Fellmuth, J. Seidel, G. Scholz),
   Berlin (Germany), 85-90.

7. The International Temperature Scale of 1990, Metrologia, 1990, 27, 3-10.


Advanced Mathematical and Computational Tools in Metrology VI 
Edited by P Ciarlini, M G Cox, F Pavese & G B Rossi 
© 2004 World Scientific Publishing Company (pp. 323-326) 


FROM THE FIXED POINTS CALIBRATION TO THE CERTIFICATE:
A COMPLETELY AUTOMATED TEMPERATURE LABORATORY 


A.MERLONE, P.MARCARINO, P.P.M. STEUR, R. DEMATTEIS 


Istituto di Metrologia “G. Colonnetti”, IMGC-CNR, Strada delle Cacce 73, 10135 
Torino, Italy 


E-mail: a.merlone@imgc.cnr.it


At the Intermediate Temperature Laboratory of the Istituto di Metrologia “G. Colonnetti” 
(IMGC) all operations concerning the calibration of standard platinum resistance 
thermometers (SPRTs) at the ITS-90 fixed points are now completely automated. The 
software is based upon Visual Basic© modules, forms, macros, I/O connection and 
dialogs with the main Microsoft® Office© application. These modules and other “.exe” 
files are also used for research purposes as they can be separately used as virtual consoles 
for data acquisition, pre-processing and post-processing. Several blocks call on each 
other, starting from the data acquisition to the certificate printout: those blocks can be 
used as useful tools for accredited calibration laboratories. Statistics, data input 
evaluation, cross controls, ITS-90 requirements and equations, fixed points and 
thermometer databases are in continuous dialog, till the end of the calibration procedure. 
All the data, both from the calibration and the research activities, are
automatically saved every week.


1. Introduction 


The Intermediate Temperature Laboratory of the Istituto di Metrologia “G. 
Colonnetti” (IMGC) is responsible for the ITS-90 [1] maintenance and the
calibration of SPRTs at the fixed points [2]. The laboratory performs SPRT 
calibrations for research activities, for the contact thermometry laboratory of the 
IMGC calibration service and for several calibration laboratories of SIT (Italian 
Calibrations Services). All the calibration procedures, from the resistance 
measurements to the certificate issue, are now completely automated. 

The programs are continuously running, twenty four hours a day, in order 
to keep under strict control all the single parts involved in the whole system. 
During those times free from calibration or research activities several noise 
analyses, instruments stability measurements and I/O tests are performed. 

Each single fixed point calibration takes into account all the requested 
procedures such as the temperature stability, self-heating and the standard 
deviation of a pre-set number of readings. The whole measurement sequence is 
automated. The operator just pushes a virtual button and all the data start to be 
acquired, evaluated, displayed and saved. The calibration procedures are now
easier and quicker and the whole system is flexible enough to meet several
kinds of requirements, from research to industrial PRT calibration. The software
has been completely developed at IMGC. Several blocks can be used in future as
validated software by metrological laboratories for accredited temperature
measurements.


2. The procedure 


As the SPRT is inserted in the thermometer well of the fixed point cell, the 
complete process for the calibration is started. Many virtual consoles take part 
in the data acquisition, from the instruments connected to the SPRT. The 
implementation of object-oriented systems has led to the realization of those 
ActiveX modules specialized in the management of instrumentation, that show 
the same interface to the outside world. These consoles allow a complete control 
of all the instruments involved in the calibrations. The user-interfaces that have 
been developed reflect the potential of the instruments, with useful functions for 
the user, directly available through list-boxes, check-boxes and command- 
buttons, as well as the potential of the software, that allow the totally automated 
operation of any sequence. The instruments are all connected to PCs and all the 
I/O information fluxes are recorded and controlled. 

The calibration uncertainty is automatically evaluated taking into account 
the stability of the thermometer at the triple point of water before and after each 
fixed point determination, the fixed point reproducibility, ITS-90 specifications 
and all the other A-type and B-type uncertainty components; several subroutines 
are involved in those processes. A preliminary statistical analysis of the data 
acquired is evaluated in real time by the virtual consoles that drive the 
acquisition instruments. Any malfunctioning is immediately evaluated and the 
operator can decide whether to repeat the calibration or to stop it. Next, the 
series of fixed point calibrations for the required ITS-90 sub-range are stored in 
the data-acquisition database that is used both for emission of the final 
certificate and as a complete record of the “history” of all thermometers. 

Several Excel© macros are automatically recalled from the procedure and
are used to evaluate the calibration constants from the ITS-90 requirements, 
considering the temperature range and the related temperature fixed points 
involved in the specific thermometer calibration. 

The hydrostatic temperature correction is automatically taken into account,
considering the cell used and the thermometer dimensions. The information
regarding the fixed point cells is that evaluated during the CCT Key
Comparisons and used for the CMC [2]. It is automatically recalled from
the specific database.
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Figure 1. Schematic design of the blocks involved in the whole calibration process 


The processed data is then sent to a Word© document that uses the
Institute’s model for the calibration certificate. This model is then automatically 
compiled with all the data requested, then printed. This will then be the 
calibration certificate, ready to be signed and sent with the SPRT. The complete 
procedures, the I/O devices and sequences, the data acquisition modules and the 
databases are continuously controlled and cross-checked. The whole procedure 
is a part of the laboratory quality system, as well as all the instrument and 
temperature standard procedures. The developed software involves the main 
Microsoft® Office© applications and dedicated data-acquisition modules [3],
written in Visual Basic®. Those modules are also used for research purposes as 
they can be separately used as virtual consoles or acquisition modules. They 
represent useful tools for a metrological laboratory in the field of temperature 
data acquisition and can be validated as single parts for all the requirements of 
the ITS-90 specifications. 
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Figure 2. System logical diagram. 


The structure that has been realized is therefore made up of a number of 
servers specialized in managing instruments (remote consoles) and some client 
programs specialized in the realization of specific functions. Once the modules 
for communication with the instruments have been created, the management 
programs become users of the service supplied by these same modules and 
represent in synthesis the requirements of the user; in this aspect they can be 
developed with those instruments that are most apt for their use. 

The Word© document contains the evaluated calibration constants, each
fixed point W value measured, dW/dT tables (see ITS-90 specifications) and the
associated uncertainty value and curves. All data, both from the calibration and
the research activities, are automatically saved every week in two different
backup directories on two separate computers.
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A new off-line gain stabilisation method is applied to high-resolution alpha- 
particle spectrometry of U. The software package SHIFTER automatically 
identifies and quantifies gain shift for intermediate spectra or even individual 
data in list mode files. By reversing the gain shift before combining all data 
into one sum spectrum, one can optimise the overall resolution. This automatic 
procedure is very useful with high-resolution spectrometry at low count rate, as 
a compensation for gain drift during the long measurement. 


1. Introduction 


A limitation of alpha-particle spectrometry arises from spectral interference 
caused by the presence of nuclides with near identical alpha-emissions, which 
appear as unresolved or partially resolved multiplets in the measured energy 
spectrum. As the optimum resolution requires thin radioactive sources, small 
detectors and relatively high source-detector distances, one needs long 
measurement times to obtain the required statistical accuracy. Hence, electronic 
gain shift with time can significantly influence the overall resolution. 

In this work, we apply a new off-line gain shift correction method [1] to a 
series of 150 sequential high-resolution alpha-particle energy spectra (and the 
corresponding list mode data files), taken over a period of six months. The 
‘Stieltjes integral’ of the measured spectra with respect to a reference spectrum 
is used as an indicator for gain instability. ‘Exponentially moving averages’ of 
the latter show the gain shift as a function of time. With this information, the 
data are relocated on a spectrum-by-spectrum or a point-by-point basis. 

Advantages of this method are the flexibility involved with the off-line 
analysis, allowing for optimisation by iterative processes, and the optimal use of 
scarce information inherent to measurements at low count rate. It is an 
alternative to other correction techniques like e.g. an electronic gain stabiliser 
[2], which requires a distinct energy peak with sufficient count rate, or a manual 
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shift of individual spectra by a discrete number of channels, based on a visual 
check of certain peak positions. 


2. Gain stabilisation method 


In the case of alpha-particle spectrometry, most radionuclides are long-lived 
and the spectral shape can be assumed constant throughout the measurement 
campaign. The relevant alpha peaks for *°U are situated between 4.15 and 
4.6 MeV, corresponding to only 10% of the energy range of the measured 
spectrum. Hence, a modest gain instability results in a shift of the region of 
interest by practically the same amount of channels. The human eye recognises 
gain shifts by the displacement of spectral features (peaks, valleys) with respect 
to their position in a previous spectrum, acting as a reference spectrum. Off-line 
gain stabilisation implies quantification and counteraction of the shift for each 
measured spectrum (or data point). 

Mathematically one can align two spectra e.g. by maximising their mutual
convolution. A more convenient way of estimating the shift is based on the
Stieltjes integral of a spectrum, $f_{\mathrm{shifted}}(x)$, with respect to the
reference spectrum, $f(x)$:


$$\mathrm{shift} = \frac{\int_a^b f_{\mathrm{shifted}}(x)\, f'(x)\, dx - \tfrac{1}{2}\left[ f(b)^2 - f(a)^2 \right]}{\int_a^b f'(x)^2\, dx} \qquad (1)$$


The second term in the numerator is negligible if the integration is performed
between two points where the count rate is extremely low, i.e. f(a) = f(b) = 0.
The alpha particle spectra are taken with high resolution (8K channels), so that
the spectrum f(x) is a continuous and slowly varying function from one channel
to the next. The Stieltjes integral combines the number of counts in each
channel with the first derivative in the corresponding channel of the reference
spectrum. The predominance of counts in channels with a negative $f'(x)$ value
is interpreted as a drift in the forward direction, and vice versa. The
denominator acts as a normalisation factor, based on the consideration that
$f'(x) = f_{\mathrm{shifted}}(x) - f(x)$ if the shift is exactly one channel, i.e.
$f_{\mathrm{shifted}}(x) = f(x+1)$.

If data are collected one-by-one in list mode, the shift can be followed as a 
function of time and an individual correction can be made for each count. The 
local shift is then obtained from an ‘exponentially moving average’: 
$$\mathrm{shift} = f'(x) \left[ \int_a^b f(x)\, dx \Big/ \int_a^b f'(x)^2\, dx \right] n^{-1} + \mathrm{shift}_{\mathrm{previous}} \left( 1 - n^{-1} \right) \qquad (2)$$


in which n is a number representing the ‘memory range’. A more stable trend 
line is obtained by analysing the data in the backward as well as the forward 
direction, as local effects cancel out in the average and global changes are 
detected with less delay. 
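
The shift estimate for a whole spectrum can be sketched in a few lines. This is not the SHIFTER package: the function names, the central-difference derivative and the Gaussian test peak are illustrative assumptions, and the sign convention follows the normalisation stated in the text, in which a spectrum with shifted(x) = reference(x + 1) gives a shift of one channel.

```python
import math


def derivative(spectrum):
    """Central-difference derivative per channel; end channels set to zero."""
    d = [0.0] * len(spectrum)
    for x in range(1, len(spectrum) - 1):
        d[x] = 0.5 * (spectrum[x + 1] - spectrum[x - 1])
    return d


def estimate_shift(reference, shifted):
    """Shift in channels from the Stieltjes integral of `shifted`
    with respect to `reference`, normalised by the sum of f'(x)^2."""
    d = derivative(reference)
    numerator = sum(s * dx for s, dx in zip(shifted, d))
    # Boundary term; negligible when the count rate at both ends is low.
    numerator -= 0.5 * (reference[-1] ** 2 - reference[0] ** 2)
    denominator = sum(dx * dx for dx in d)
    return numerator / denominator


def peak(x, centre=100.0, width=10.0):
    """Smooth test peak standing in for a measured alpha line."""
    return math.exp(-0.5 * ((x - centre) / width) ** 2)


reference = [peak(x) for x in range(201)]
shifted_by_one = [peak(x + 1) for x in range(201)]
print(estimate_shift(reference, shifted_by_one))
```

For a one-channel shift the estimate is 1 up to rounding. For larger shifts the linear approximation underestimates slightly, which is one reason an iterative off-line correction is convenient.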


3. Experimental result 


Figure 1 shows a fraction of the summed 7U alpha-particle spectrum, with 
and without gain shift correction. The software package SHIFTER (Version 1.1) 
automatically corrected for gain shifts between —5 and +5 channels, but needed 
some initial aid to correct for a shift by 20 channels when a new detector was 
taken into use. The best result was obtained with point-by-point shift correction. 


[Plot legend: sum of spectra, no correction; spectrum-by-spectrum shift corrected;
point-by-point shift corrected. Annotation: data from new detector. Axes: number of
counts against channel number, channels 5970 to 6060.]


Figure 1. Gain shift corrected and non-corrected sum of 150 alpha spectra. 
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In recent years, a large number of programs have been equipped with ANOVA 
(Analysis of Variance) functions. The expression of expectation of variance in 
ANOVA must be calculated in order to evaluate each standard uncertainty. 
However, modern software does not yet offer the functionality to calculate this 
expression of expectation of variance. In this study, expectations of variance in
ANOVA were formulated, and a new program was developed that calculates the
expression of the expectation of variance in typical and specific experimental
designs and displays the symbolic expectation of each variance.


1. Introduction 


There are two methods used in estimating the uncertainties of multiple factors. 
One focuses on a single factor, leaving levels for other factors fixed. This 
method is called a one-factor experiment. The other method uses combinations 
of every possible level for all factors. This method is called a full factorial 
experiment. The effects of a full factorial experiment are calculated by ANOVA, 
which is a tool often used in metrology. Annex H5 in GUM [1] is a first 
example of estimating uncertainty using ANOVA. ANOVA is used to estimate 
the effects of both within-day and between-day variability of observations. In 
recent years, a large number of programs have been equipped with ANOVA 
functions. However, the ANOVA features of such software are designed mainly for
F-tests, that is, to determine the level of significance of each factor relative
to the error factor. For uncertainty evaluation, the expression of the
expectation of variance of each factor must be derived in order to calculate the
variance of each factor. However, this kind of software is not yet equipped with
a feature that allows calculation of the expression of expectation of variance.
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This research focused on formulating expectations of variance in ANOVA 
[2] and developing a new program that calculates the expression of the 
expectation of variance in typical as well as specific experimental designs and 
displays symbolic expectations of each variance. This software greatly helps 
reduce work required in uncertainty evaluations. 


2. Algorithms used in calculating expectation of variance 


2.1. Completely randomized design 


Completely randomized design is the method used when each piece of data 
measured is taken in random order. The following equation shows the 
expression of expectation of variance of factor B out of factors A, B, and C in a 
three-factor repetition experiment. 

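$$B:\ \sigma_e^2 + e_A e_C\, n\, \sigma_{A\times B\times C}^2 + e_A\, c\, n\, \sigma_{A\times B}^2 + e_C\, a\, n\, \sigma_{B\times C}^2 + a\, c\, n\, \sigma_B^2 \qquad (1)$$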


Here, $e_A$ is an indicator: if factor A is a parameter, 0 is substituted for
$e_A$, and if factor A is a random variable, 1 is substituted for $e_A$ (and
likewise for $e_C$). The number of levels of each factor is denoted by the
corresponding lowercase letter, and n is the number of repetitions. Fig. 1 shows
an example of the output of our newly developed software. This is a three-factor
repetition experiment in which factor A has 3 levels and is a parameter, factor B
has 4 levels and is a random variable, and factor C has 5 levels and is also a
random variable. The number of repetitions is 10.
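
The rule behind this kind of expression can be sketched in a few lines. The sketch below is not the authors' Excel add-in; it assumes a fully crossed design with n repetitions and applies the common rule that a variance component of an interaction appears in the expectation of the mean square of a factor only if all the other factors in that interaction are random variables. The function name and the output format are hypothetical.

```python
from itertools import combinations


def expected_mean_square(target, levels, is_random, n):
    """Terms (coefficient, component) of E[MS] for the effect `target`.

    target    : tuple of factor names, e.g. ("B",)
    levels    : dict factor -> number of levels
    is_random : dict factor -> True for a random variable, False for a parameter
    n         : number of repetitions
    """
    factors = sorted(levels)
    others = [f for f in factors if f not in target]
    terms = [(1, "e")]  # the error variance always enters with coefficient 1
    for size in range(len(others), -1, -1):
        for extra in combinations(others, size):
            # an interaction contributes only if every added factor is random
            if not all(is_random[f] for f in extra):
                continue
            effect = sorted(set(target) | set(extra))
            coeff = n
            for f in factors:
                if f not in effect:
                    coeff *= levels[f]
            terms.append((coeff, "x".join(effect)))
    return terms


terms = expected_mean_square(
    ("B",),
    levels={"A": 3, "B": 4, "C": 5},
    is_random={"A": False, "B": True, "C": True},
    n=10,
)
print(" + ".join(f"{c}*var({name})" for c, name in terms))
```

For the design described above (A a parameter, B and C random variables, n = 10), the terms containing A vanish, leaving the error variance, 30 times the B×C component and 150 times the B component.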
Fig. 1: Example of a completely randomized design

2.2. Split-plot design 


In a split-plot design, measured data are not taken in a fully random order;
instead, randomization is performed hierarchically. The following equation shows
the expression of the expectation of variance for a split-plot design with,
firstly, factor A (parameter, a levels, n repetitions) and, secondly, factor B
(parameter, b levels).

$$A:\ \sigma_{e(2)}^2 + b\,\sigma_{e(1)}^2 + e_B\, n\, \sigma_{A\times B}^2 + b\, n\, \sigma_A^2 \qquad (2)$$


Fig.2 shows an example of output by our software. This is the expectation 
of variance in factors A (parameter, 3 levels, n=10) and B (parameter, 5 levels). 
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Fig.2: Example of split-plot design 
2.3. Nested design 


Nested design is one method used when factors are random variables. The 
following equation shows the expression of expectation of variance of factors A, 
B, and C in a nested design. 

$$A:\ \sigma_{C(AB)}^2 + c\,\sigma_{B(A)}^2 + b\,c\,\sigma_A^2, \qquad B:\ \sigma_{C(AB)}^2 + c\,\sigma_{B(A)}^2, \qquad C:\ \sigma_{C(AB)}^2 \qquad (3)$$

Fig.3 is an example of output produced by our software. This is the expectation 
of variance in factors A (3 levels), B (5 levels), and C (2 levels). 
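
For a fully nested design with random factors, the coefficients follow a simple pattern: each component is multiplied by the numbers of levels of all factors nested below it. The following sketch illustrates that pattern under this assumption; the function name and the output format are hypothetical and do not represent the software described here.

```python
def nested_expected_mean_squares(names, levels):
    """E[MS] terms for a fully nested design with random factors.

    names  : factor names ordered from outermost to innermost
    levels : numbers of levels in the same order
    Returns a dict: factor -> list of (coefficient, component) terms.
    """
    result = {}
    for i, name in enumerate(names):
        terms = []
        for j in range(len(names) - 1, i - 1, -1):
            coeff = 1
            for k in range(j + 1, len(names)):
                coeff *= levels[k]  # levels of the factors nested below j
            terms.append((coeff, names[j]))
        result[name] = terms
    return result


ems = nested_expected_mean_squares(["A", "B", "C"], [3, 5, 2])
for factor, terms in ems.items():
    print(factor, terms)
```

With A (3 levels), B (5 levels) and C (2 levels), the mean square of A has the coefficients 1, 2 and 10 for the components of C, B and A respectively.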
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Fig.3: Example of nested design 


3. Conclusions 


In this research, we produced software that automatically calculates not 
only each variance but the expression of the expectation of variance for a full 
factorial experiment as well as in split-plot and nested designs. This software 
works as an add-in for Microsoft Excel. In the future there are plans to enable 
free download of this software. 
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Passive sonar records the sound radiated by a target. A signal corresponding to 
a vessel is called a frequency track. A single noise source typically produces 
several frequency tracks, which collectively form a harmonic set. Often, there 
are several detectable harmonic sets, and so a key problem in passive sonar is 
to group the frequency tracks into their corresponding harmonic sets. This 
paper describes a novel method of identifying the different harmonic sets using 
a combination of data fitting and template matching. 


1. Introduction 


This paper describes an algorithm for associating into groups frequency
tracks recorded using passive sonar. A frequency track is a noise signal
corresponding to a target; typically there are several tracks associated with
each noise source. Each group of frequency tracks is called a harmonic set, and a
harmonic set provides vital information to a sonar operator in classifying a
target. This paper uses the magnitude FFT data from the measured noise, which
gives frequency and amplitude information over time. The frequency tracks in
the harmonic set have the property that if the lowest frequency track (which is
called the fundamental) has a frequency $f(t)$ at time $t$, then all other tracks
have a frequency $\lambda f(t)$, where $\lambda$ is a positive integer.


The algorithm described in this paper uses a technique known as template 
matching [2] in order to construct the harmonic sets, but also exploits the known 
physical behaviour of frequency tracks belonging to a harmonic set. Typically, 
template matching is used only to compare how closely two objects match, but 
here it is also necessary to make a decision as to whether the frequency tracks 
should be associated. 


* Work partially funded under EU SofTools_MetroNet Contract N° G6RT-CT-2001-05061.
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2. Analysis Procedure 


Figure 1 outlines the analysis procedure of each batch of data. Template 
matching [2] is used to provisionally group frequency tracks into harmonic sets. 
The first step is to form a template, which is achieved by fitting a B-spline g(t) 
[1] to a frequency track. A simple knot placement algorithm that places knots at 
every nth data point is used, where the number of knots placed determines n. 













[Flow chart of the analysis procedure; the only recoverable box label is
"Calculate harmonic number of tracks".]


Figure 1 Overview of analysis procedure for each batch of data. 


The physics governing harmonic sets means that the only transformation
required to successfully match tracks from the same set is a scaling in
frequency, defined as $S$. Each data set representing a frequency track can be
expressed as

$$\{(f_i, t_i)\}_{i=1}^{m} = (\mathbf{f}, \mathbf{t}), \qquad (1)$$

where $f_i$ represents the frequency of the data at time $t_i$. Template matching
transforms $(\mathbf{f}, \mathbf{t})$ to a new data set $(\hat{\mathbf{f}}, \mathbf{t})$
in the same region as $g(t)$, where $\hat{\mathbf{f}} = S\mathbf{f}$.
The quality of match is measured in the $l_2$-norm, so that the optimum match is
found by solving the least squares minimisation problem

$$\min_S E = (\hat{\mathbf{f}} - g(\mathbf{t}))^T (\hat{\mathbf{f}} - g(\mathbf{t})). \qquad (2)$$

The value of $S$ that achieves this minimisation is found by differentiating $E$
with respect to $S$ and setting the result equal to zero. This yields the result

$$\mathbf{f} S = g(\mathbf{t}), \qquad (3)$$

which can be solved using standard linear least squares algorithms.
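
Because only the single parameter S is fitted, the least squares solution has a closed form. The sketch below assumes that the template values g(t_i) have already been evaluated at the track times; the function name is hypothetical, and a general least squares solver would serve equally well.

```python
def fit_scale(f, g):
    """Least squares scaling S minimising sum((S*f_i - g_i)**2).

    f : frequencies of the track at times t_i
    g : template values g(t_i) at the same times
    Returns (S, E), where E is the residual sum of squares.
    """
    numerator = sum(fi * gi for fi, gi in zip(f, g))
    denominator = sum(fi * fi for fi in f)
    s = numerator / denominator
    residual = sum((s * fi - gi) ** 2 for fi, gi in zip(f, g))
    return s, residual


s, e = fit_scale([1.0, 2.0, 3.0], [2.0, 4.0, 6.0])
print(s, e)
```

The residual E returned here is the quantity that is compared with the user-defined threshold when deciding whether the match is positive.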

A decision is required as to whether a positive match has occurred, since
tracks may be present that do not match the template. The decision is made
using the value of the residual error $E$: if $E$ is less than some user-defined
threshold then the match is found to be positive, and the track is provisionally
assigned to the harmonic set. If several harmonic sets exhibit similar
characteristics then this technique can result in tracks being incorrectly assigned
to a set. The physics governing harmonic sets means that there is a relationship
between the scaling factors, $S$, calculated for the frequency tracks in the set that
can be exploited to identify any rogue tracks. The scaling factor between the
template and the $n$th harmonic in a set is $S_n = n/N$, where $N$ is the harmonic
number of the template. The difference in scaling factors of sequential
harmonics is constant and equals

$$S_{n+1} - S_n = (n+1)/N - n/N = 1/N. \qquad (4)$$

Assuming there are more correctly assigned tracks than rogue tracks in the
harmonic set, the value $1/N$, termed the difference in scaling factors $\delta$, can be
found by calculating the differences in scaling factors for every pairwise
combination in the set. The most frequent value within a given tolerance will
equal $\delta$. All tracks for which another track exists with a difference of $\delta$
between their scaling factors are identified as harmonics of that set; all other
tracks are discarded.
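
The pairwise-difference step can be sketched as follows. This is one possible reading of the procedure, not the authors' code: the most frequent difference is found by counting, for each candidate, how many pairwise differences lie within the tolerance of it. The function names are hypothetical, and at least two tracks are assumed.

```python
from itertools import combinations


def find_delta(scales, tol):
    """Most frequent pairwise difference of scaling factors (within tol)."""
    diffs = sorted(abs(a - b) for a, b in combinations(scales, 2))
    best, best_count = None, 0
    for d in diffs:
        count = sum(1 for e in diffs if abs(e - d) <= tol)
        if count > best_count:
            best, best_count = d, count
    return best


def keep_harmonics(scales, tol):
    """Keep tracks that have a partner whose scaling factor differs by delta."""
    delta = find_delta(scales, tol)
    kept = []
    for i, s in enumerate(scales):
        for j, t in enumerate(scales):
            if i != j and abs(abs(s - t) - delta) <= tol:
                kept.append(s)
                break
    return kept


scales = [0.25, 0.5, 0.75, 1.0, 0.62]  # the last value plays the rogue track
print(find_delta(scales, 0.01), keep_harmonics(scales, 0.01))
```

In this made-up example the template is the fourth harmonic (N = 4), so the expected difference is 0.25 and the track with scaling factor 0.62 is discarded.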


3. Numerical Results 


Figure 2a shows a grayscale plot of a set of synthetic frequency data. At any
given time, the brightness of a pixel increases as the amplitude at that
frequency increases. A frequency track is distinguished by high amplitude
frequency data, and so from Figure 2a it can be seen that there are three
different harmonic sets and corresponding frequency tracks. Figure 2b shows
the harmonic sets that have been formed for the data set. In the figure,
frequency tracks belonging to different harmonic sets are distinguished by being
plotted with different line styles. It can be seen from Figure 2b that the
algorithm successfully constructs the harmonic sets.
Figure 2. (a) Sample batch of data. (b) The resulting harmonic sets.

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We describe a method for the rapid calculation of an interval covering approx- 
imately a specified proportion of the distribution of a function of independent 
random variables. Its speed will make it particularly valuable in inverse problems.
The first four moments of the output variable are obtained by combining the mo- 
ments of the input variables according to the function. A Pearson distribution is 
fitted to these moments to give 95% intervals that are accurate in practice. 


1. Introduction 


Suppose that a known function F relates a measurand to n influence quan- 
tities. Let X1,...,Xn be variables with the distributions assigned to the n 
input quantities and let Y represent the measurand, so Y = F(X1,..., Xn). 
This paper describes a rapid method of obtaining an interval enclosing ap- 
proximately a specified proportion, say 0.95, of the mass of the distribution 
of Y. The method can serve as a complement to the method of Monte Carlo 
simulation, and may be particularly valuable in inverse problems where an 
interval is specified and a distribution is sought. 


2. Moments and cumulants 


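The method involves calculation with the moments and cumulants of
the variables. The $r$th moment about the origin of any variable $X$ is
$\mu'_r(X) = E[X^r]$ and the $r$th central moment is $\mu_r(X) = E[(X - \mu'_1)^r]$.
These are related by the equations [1]

$$\mu_r = \sum_{j=0}^{r} (-1)^j \binom{r}{j} (\mu'_1)^j \mu'_{r-j}$$
$$\mu'_r = \sum_{j=0}^{r} \binom{r}{j} (\mu'_1)^j \mu_{r-j}.$$

The mean $\mu'_1$ and variance $\mu_2$ determine location and scale, while $\mu_3$ and
$\mu_4$ add information on skewness $\sqrt{\beta_1} = \mu_3/\mu_2^{3/2}$ and kurtosis
$\beta_2 = \mu_4/\mu_2^2$. Cumulants $\{\kappa_r\}$ are functions of moments, and vice
versa [1]. They are related by [2]

$$\kappa_r = \mu_r - \sum_{j=2}^{r-2} \binom{r-1}{j} \mu_j \kappa_{r-j}, \qquad r \ge 2$$
$$\mu_r = \kappa_r + \sum_{j=2}^{r-2} \binom{r-1}{j} \mu_j \kappa_{r-j}, \qquad r \ge 2.$$

In particular, $\kappa_1 = \mu'_1$, $\kappa_2 = \mu_2$, $\kappa_3 = \mu_3$, $\kappa_4 = \mu_4 - 3\mu_2^2$.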
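
The recursion between central moments and cumulants is short enough to sketch directly. The code below is an illustration only; the function name and the list layout (index r holds the r-th central moment, with entries 0 and 1 equal to 1 and 0) are assumptions made for this example. The mean, which equals the first cumulant, is handled separately.

```python
from math import comb


def cumulants_from_central(mu):
    """Cumulants kappa_2..kappa_R from central moments mu[0..R].

    mu[0] must be 1 and mu[1] must be 0. The returned list uses the
    same indexing; entries 0 and 1 are placeholders set to 0.
    """
    top = len(mu) - 1
    kappa = [0.0] * (top + 1)
    for r in range(2, top + 1):
        # sum over j = 2 .. r-2 of C(r-1, j) * mu_j * kappa_(r-j)
        correction = sum(
            comb(r - 1, j) * mu[j] * kappa[r - j] for j in range(2, r - 1)
        )
        kappa[r] = mu[r] - correction
    return kappa


normal = cumulants_from_central([1.0, 0.0, 1.0, 0.0, 3.0, 0.0, 15.0])
uniform = cumulants_from_central([1.0, 0.0, 1.0 / 3.0, 0.0, 1.0 / 5.0])
print(normal, uniform)
```

For the standard normal variable all cumulants above the second vanish, which gives a convenient check of the recursion; for the uniform variable on the interval from −1 to 1 the fourth cumulant is −2/15.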


3. Moments of input distributions 


The symmetric input variables listed in Table 1 are the arc-sine variable on
the interval from $-1$ to $1$, $A[-1,1]$; the triangular variable, $T[-1,1]$; the
uniform variable, $U[-1,1]$; the standard normal variable, $N(0,1)$; and the
Student’s t-variable, $t_\nu$. Exponential variables can also be used.


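Table 1. Moments of symmetric input distributions.

Input variable | Moment $\mu_r$ ($r$ even) | Kurtosis $\beta_2$
$A[-1,1]$ | $\dfrac{1\cdot 3\cdots(r-1)}{2\cdot 4\cdots r}$ | 1.5
$U[-1,1]$ | $1/(r+1)$ | 1.8
$T[-1,1]$ | $\dfrac{2}{(r+1)(r+2)}$ | 2.4
$N(0,1)$ | $\dfrac{r!}{2^{r/2}\,(r/2)!}$ | 3
$t_\nu$ | $\dfrac{\nu^{r/2}\cdot 1\cdot 3\cdots(r-1)}{(\nu-2)(\nu-4)\cdots(\nu-r)}$ | $3(\nu-2)/(\nu-4)$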




4. Finding the moments of the output 


To obtain the moments of $Y$ we break up $F$ into elementary operations
$+$, $-$, $\times$, $\div$, exp, log etc., and carry out moment calculations for each
operation, while converting between moments and cumulants as required.
Arithmetic operations on independent variables $X_h$ and $X_i$ obey

$$\kappa_r(aX_h + bX_i) = a^r \kappa_r(X_h) + b^r \kappa_r(X_i)$$
$$\mu'_r(aX_h X_i) = a^r \mu'_r(X_h)\,\mu'_r(X_i).$$
Moments about zero quickly become large if the mean is distant from zero
and, consequently, round-off errors dominate in sums of large oscillating
terms. So the higher moments of a product are actually obtained by
expressing $\mu_r(aX_hX_i)$ in terms of the means and central moments,

$$\mu_r(aX_hX_i) = a^r \sum_{j=0}^{r}\sum_{k=0}^{r-j} \frac{r!}{j!\,k!\,(r-j-k)!}\, \mu'_1(X_i)^j\, \mu'_1(X_h)^k\, \mu_{r-k}(X_h)\,\mu_{r-j}(X_i),$$

which is obtained by writing $X_h$ and $X_i$ each as its mean plus a deviation.

Let $\mu'_1$ and $\mu_r$ denote $\mu'_1(X)$ and $\mu_r(X)$. For the reciprocal and other
functions $f(X)$ we have the Taylor expansion $f(X) = f(\mu'_1) + \sum_{j=1}^{\infty} c_j \delta^j$
where $c_j = f^{(j)}(\mu'_1)/j!$ and $\delta = X - \mu'_1$. Analysis gives

$$\mu'_1(f(X)) = f(\mu'_1) + b, \qquad b = \sum_{j=2}^{\infty} c_j \mu_j$$
$$\mu_r(f(X)) = r! \sum_{q_0=0}^{r} (-b)^{q_0}\, L(r - q_0)/q_0!$$

where

$$L(r - q_0) = \sum_{q_1 + q_2 + \cdots = r - q_0} \left( \prod_{j=1}^{\infty} \frac{c_j^{q_j}}{q_j!} \right) \mu_Q$$

with $Q = \sum_{j=1}^{\infty} j q_j$. The $c_j$ values are easily found for the common
functions $f(X) = 1/X$, $\sqrt{X}$, $1/\sqrt{X}$, $X^2$, $\log X$, $\exp X$, $\cos X$, $\sin X$. The
greatest practical difficulty in the calculation of $\mu_r(f(X))$ is in the
calculation of $L(r - q_0)$, which, in the general case, is a convergent sum of
infinite dimension. An algorithm for this has been obtained.
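
As a small check of the series for the mean, the sketch below evaluates it for the reciprocal of a uniform variable and compares it with the closed-form mean of 1/X. It is an illustration under stated assumptions, not the algorithm referred to above: X is taken as uniform on [m − w, m + w] with 0 < w < m, for which the odd central moments vanish and the even ones are w^j/(j+1); the function names are hypothetical.

```python
from math import log


def mean_reciprocal_series(m, w, terms=20):
    """Mean of 1/X for X uniform on [m - w, m + w], from the Taylor series."""
    total = 1.0 / m                      # f evaluated at the mean
    for j in range(2, 2 * terms + 1, 2):
        c_j = (-1) ** j / m ** (j + 1)   # f^(j)(m)/j! for f(X) = 1/X
        mu_j = w ** j / (j + 1)          # even central moment of the uniform
        total += c_j * mu_j
    return total


def mean_reciprocal_exact(m, w):
    """Closed-form mean of 1/X for the same uniform variable."""
    a, b = m - w, m + w
    return log(b / a) / (b - a)


print(mean_reciprocal_series(10.0, 1.0), mean_reciprocal_exact(10.0, 1.0))
```

The series converges quickly when w is small compared with m; as w approaches m the convergence slows, which is one reason a careful treatment of the infinite sums is needed.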

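If $X$ is uniformly distributed between the positive limits $a$ and $b$ then
exact results are

$$\mu'_r(\ln X) = \frac{1}{b-a}\sum_{k=0}^{r} (-1)^k \frac{r!}{(r-k)!}\left[ b(\ln b)^{r-k} - a(\ln a)^{r-k} \right] \quad \text{(Ref. 3, Eq. 2.722)}$$
$$\mu'_r(e^X) = \frac{e^{rb} - e^{ra}}{r(b-a)}$$
$$\mu'_r(\sqrt{X}) = \frac{2\left(b^{(r+2)/2} - a^{(r+2)/2}\right)}{(r+2)(b-a)}$$
$$\mu'_r(1/X) = \frac{a^{1-r} - b^{1-r}}{(r-1)(b-a)}, \qquad r \ge 2$$
$$\mu'_1(1/X) = \frac{\ln b - \ln a}{b-a}.$$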
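
These closed forms translate directly into code. The sketch below is illustrative only; the function names are hypothetical, and each formula is simply the integral of the stated function over [a, b] divided by b − a.

```python
from math import e, exp, factorial, log


def raw_moment_log(r, a, b):
    """r-th moment about zero of ln X for X uniform on [a, b], 0 < a < b."""
    total = 0.0
    for k in range(r + 1):
        coeff = (-1) ** k * (factorial(r) // factorial(r - k))
        total += coeff * (b * log(b) ** (r - k) - a * log(a) ** (r - k))
    return total / (b - a)


def raw_moment_exp(r, a, b):
    """r-th moment about zero of exp X."""
    return (exp(r * b) - exp(r * a)) / (r * (b - a))


def raw_moment_sqrt(r, a, b):
    """r-th moment about zero of the square root of X."""
    p = (r + 2) / 2.0
    return 2.0 * (b ** p - a ** p) / ((r + 2) * (b - a))


def raw_moment_reciprocal(r, a, b):
    """r-th moment about zero of 1/X."""
    if r == 1:
        return (log(b) - log(a)) / (b - a)
    return (a ** (1 - r) - b ** (1 - r)) / ((r - 1) * (b - a))


print(raw_moment_log(1, 1.0, e), raw_moment_reciprocal(2, 1.0, 2.0))
```

Simple limits make the formulas easy to check by hand: for example, the integral of ln x from 1 to e equals 1, so the mean of ln X on [1, e] is 1/(e − 1).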


Our implementation of the method involves an algorithm that keeps
track of the moments of the $X_i$ variables and variables formed at
intermediate stages in the calculation of $Y$.
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5. Approximating with a Pearson distribution 


The Pearson system of statistical distributions includes normal, uniform, 
Student’s t, chi-square, gamma, beta and arc-sine distributions as special 
cases [4]. The system contains a unique distribution for every possible set of
the moments $\mu'_1$, $\mu_2$, $\mu_3$ and $\mu_4$, for matching to the same moments of $Y$.
The flexibility of the system means that the distribution obtained will be 
a good approximation to the distribution of Y in practice. 

We find the interval enclosing the specified proportion of this Pearson 
distribution from its standardized form. Percentage points of standardized 
Pearson distributions indexed by $\beta_1$ (or $\sqrt{\beta_1}$) and $\beta_2$ are available in tables
for certain probability levels,’ or may be calculated by a simple approx- 
imate method for an arbitrary probability level.”® For greater accuracy, 
the percentage points could be found from the distribution itself, whose 
probability density function can be obtained from the moments. 


6. Examples 


(i) Suppose $Y = \ln X_1 \times \exp X_2$ where $X_1 \sim U[9, 11]$ and $X_2 \sim U[3, 3.1]$
independently. (The notation $\sim$ means ‘is distributed as’.) The method gives
$\mu'_1(Y) = 48.569$, $\mu_2(Y) = 3.476$, $\sqrt{\beta_1}(Y) = 0.0314$ and $\beta_2(Y) = 2.389$. By
interpolating in the tables [5], we find that the interval containing the central
95% of the Pearson distribution with this set of moments is [45.1, 52.1].
Monte Carlo simulation using $10^6$ trials gave the interval [45.123 ± 0.007,
52.188 ± 0.007], where the uncertainties represent 2 standard deviations.


(ii) Suppose $Y = X_1X_2 + X_3$ where $X_1 \sim N(10,4)$, $X_2 \sim N(10,1)$ and
$X_3 \sim U[90,110]$ independently. We find $\mu'_1(Y) = 200$, $\mu_2(Y) = 537.33$,
$\sqrt{\beta_1}(Y) = 0.193$ and $\beta_2(Y) = 3.08$, and the 95% interval of $Y$ is [157, 248] [5].
Monte Carlo simulation with $10^6$ trials gave [156.6 ± 0.1, 247.5 ± 0.2].
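
The moments quoted in example (ii) can be reproduced with a few lines that combine the product rule for central moments with the additivity of cumulants. The sketch below is an independent illustration, not the authors' implementation; it reads N(10,4) as mean 10 and variance 4, writes each factor as its mean plus a deviation to obtain the central moments of the product, and uses hypothetical function names.

```python
from math import factorial


def product_central_moment(r, mean_h, mu_h, mean_i, mu_i):
    """r-th central moment of X_h * X_i for independent X_h and X_i.

    mu_h[k] and mu_i[k] are the k-th central moments (mu[0] = 1, mu[1] = 0).
    """
    total = 0.0
    for j in range(r + 1):
        for k in range(r - j + 1):
            coeff = factorial(r) // (
                factorial(j) * factorial(k) * factorial(r - j - k)
            )
            total += coeff * mean_i ** j * mean_h ** k * mu_h[r - k] * mu_i[r - j]
    return total


def normal_central(var):
    return [1.0, 0.0, var, 0.0, 3.0 * var ** 2]


def uniform_central(half_width):
    return [1.0, 0.0, half_width ** 2 / 3.0, 0.0, half_width ** 4 / 5.0]


def cumulants_2_to_4(mu):
    return mu[2], mu[3], mu[4] - 3.0 * mu[2] ** 2


mu_1, mu_2 = normal_central(4.0), normal_central(1.0)
mu_prod = [1.0, 0.0] + [
    product_central_moment(r, 10.0, mu_1, 10.0, mu_2) for r in (2, 3, 4)
]
# cumulants of independent variables add
k2, k3, k4 = (
    p + u
    for p, u in zip(cumulants_2_to_4(mu_prod), cumulants_2_to_4(uniform_central(10.0)))
)
mean_y = 10.0 * 10.0 + 100.0
sqrt_beta1 = k3 / k2 ** 1.5
beta2 = (k4 + 3.0 * k2 ** 2) / k2 ** 2
print(mean_y, round(k2, 2), round(sqrt_beta1, 3), round(beta2, 2))
```

The values obtained agree with the mean, variance, skewness and kurtosis stated in example (ii); the final step of the method, fitting the Pearson distribution, is not covered by this sketch.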


7. Comparison of speed of computation 


The advantage in speed over Monte Carlo simulation for a similar level of 
accuracy depends on the function and inputs, and is not trivial to assess. 
For example, let $Y = X_1X_2$ where $X_1$ and $X_2$ are independent normal
variables. In this situation, a 95% interval is obtained in the same time 
as that required by a Monte Carlo simulation involving only 100 unsorted 
trials. If $X_1 \sim N(3,1)$ and $X_2 \sim N(2,1)$ then the calculated interval
covers 0.953 of the true distribution of Y, and only 30% of Monte Carlo 
simulations involving 10° trials give intervals that cover a proportion closer 
to 0.95. However, if $X_1 \sim N(10,4)$ and $X_2 \sim N(10,1)$ then the calculated
interval covers 0.9501 of the true distribution of Y, while only 40% of Monte 
Carlo intervals involving 10° trials cover a proportion that is closer to 0.95. 
So, on average, Monte Carlo required much more time to achieve the same 
accuracy, especially in the second case. Figure 1 depicts these two cases. 


Figure 1. True and approximate distributions of $X_1X_2$: (a) $X_1 \sim N(3,1)$, $X_2 \sim N(2,1)$;
(b) $X_1 \sim N(10,4)$, $X_2 \sim N(10,1)$. In (a) the distributions can be distinguished between
$x = 0$ and $x = 5$. In (b) they are indistinguishable.


8. Comments 


The method requires the input variables to be independent, so correlated 
variables have to be treated to put the problem in the required form. 
For example, if $X_1$ and $X_2$ in example (ii) are dependent then we write
$Y = X_4 + X_3$ where $X_4 = X_1X_2$, and the technique can be applied if the
moments of $X_4$ are calculable by some other means.

The technique is straightforward in theory but is challenging to imple- 
ment. For example, an aspect not discussed here is the slight truncation of 
variables with infinite tails in order to avoid numerical problems that are 
caused by large or undefined high-order central moments. 

The method is the subject of patent application PCT/NZ02/00228.
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SHORT COURSE ON UNCERTAINTY EVALUATION* 
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A short course on uncertainty evaluation was held in association with the in- 
ternational conference on Advanced Mathematical and Computational Tools for 
Metrology (AMCTM 2003). The objectives of the course are stated and a brief 
description given of the lessons learned. 


1. Objectives 


The objectives of the course were to 


- Provide the probabilistic basis of uncertainty evaluation
- Present a formulation of the generic problem of uncertainty evaluation
  that is consistent with this basis
- Provide implementations (computational tools) for solving this
  problem based on the law of propagation of uncertainty (LPU)
  and the propagation of distributions
- Apply these implementations to uncertainty evaluation problems
  arising in several branches of metrology, and compare them
- Present and show the role of computational tools such as bootstrap
  re-sampling in the context of uncertainty evaluation
- Emphasize the alignment of the coverage to the Guide to the
  Expression of Uncertainty in Measurement (GUM) [1] and the
  supplemental guides being developed to support it.


2. Content 


In order to fulfill these objectives, the content of the course was

(1) The probabilistic basis of uncertainty evaluation in metrology.
    Information-based probability and probability density function
    (PDF). Model of measurement. Propagation of PDFs. Coverage
    interval and coverage probability. Estimates of measurands and
    associated uncertainty matrix. Assigning a PDF to a measurand.

(2) Implementation of uncertainty evaluation I. LPU. Multiple (output)
    measurands. Implicit models. Mutually dependent input quantities.
    Use of matrices.

(3) Implementation of uncertainty evaluation II. The propagation of
    distributions. Realization using Monte Carlo simulation (MCS).
    Interpretation of results.

(4) Applications to various metrology areas. LPU. Propagation of
    distributions. Comparisons.

(5) Computational tools. Type A uncertainty evaluation. Bootstrap
    re-sampling. Standard uncertainties. Coverage intervals.

*Work partially funded under EU project SofTools_MetroNet Contract N. G6RT-CT-
2001-05061 and supported by the Software Support for Metrology programme of the
UK’s Department of Trade and Industry.


3. Course tutors 


The tutors were Dr Patrizia Ciarlini (IAC-CNR, Rome), Professor Maurice
Cox (NPL), Dr Peter Harris (NPL), Dr Bernd Siebert (PTB) and Dr
Wolfgang Wöger (PTB), experts in various areas of uncertainty evaluation.


4. Lessons learned 


Lessons learned from the course were

(1) The GUM is a rich document, but the recommendations contained
within it are not always applicable, a statement applying both to
users and developers of software for uncertainty evaluation.
Software providers should be aware that some practitioners may make
use of software that may not be fit for all purposes to which it is
applied. Tests for the applicability of the approaches implemented
in the software would be desirable for quality assurance reasons.

(2) The GUM contains anomalies, but mainly provides a consistent
and logical probabilistic basis for uncertainty evaluation. There are
advantages in applying this probabilistic approach to all uncertainty
evaluation problems, regardless of the extent of their complexity.

(3) The basis provides a summary of the knowledge of the values of the
input and output quantities in a model of measurement in the form
of probability density functions (PDFs) or joint PDFs.

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(4) Within the probabilistic approach uncertainties are not estimated 
per se, but are calculated from the PDFs for the values of the rele- 
vant quantities. 

(5) Bayes’ Theorem and the principle of maximum (information) en- 
tropy are valuable in the assignment of PDFs on the basis of avail- 
able information concerning the values of the input quantities. 

(6) Two distinct types of correlation occur in practice: statistical and 
logical. The former arises when measurements are taken pair-wise, 
and the latter when commonality exists in the form of measurements 
taken, e.g., with the same instrument. 

(7) A coverage interval, corresponding to a prescribed coverage proba- 
bility, for the value of the output quantity always exists for a given 
PDF, even though moments of that PDF (e.g., the expectation as a 
best estimate of the value of the output quantity, and the standard 
deviation as the associated standard uncertainty) may not always 
exist. The Cauchy distribution is an example of such a PDF. 

(8) There would be advantage in avoiding the concept of effective de- 
grees of freedom in any new edition of the GUM, since it is incon- 
sistent with the probabilistic basis. It was noted that, according to 
the Joint Committee for Guides in Metrology (JCGM)*, which has 
responsibility for the GUM, the GUM will not be revised per se in 
the foreseeable future. Rather, supplemental guides to the GUM 
are being prepared by the JCGM to address such issues. 

(9) The use of the propagation of distributions (as opposed to LPU) 
provides, in principle, the PDF for the value of the output quantity. 
MCS furnishes a straightforward implementation of the propagation 
of distributions. Such an implementation can be used in any specific 
instance to validate the use of LPU, and the application of the 
Central Limit Theorem as a basis for establishing a coverage interval 
for the value of the output quantity. 
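
Lesson (9) can be illustrated with a small comparison of LPU and MCS. The sketch below is not course material: it assumes the simple model Y = X1·X2 with independent Gaussian inputs, uses Python's standard random module with a fixed seed, and the function names and the number of trials are arbitrary choices.

```python
import random
from math import sqrt


def lpu_product(x1, u1, x2, u2):
    """First-order LPU for Y = X1 * X2 with independent inputs."""
    y = x1 * x2
    u = sqrt((x2 * u1) ** 2 + (x1 * u2) ** 2)
    return y, u


def mcs_product(x1, u1, x2, u2, trials=100_000, seed=1):
    """Propagation of distributions by Monte Carlo simulation."""
    rng = random.Random(seed)
    values = sorted(rng.gauss(x1, u1) * rng.gauss(x2, u2) for _ in range(trials))
    mean = sum(values) / trials
    var = sum((v - mean) ** 2 for v in values) / (trials - 1)
    low = values[int(0.025 * trials)]        # lower end of the 95 % interval
    high = values[int(0.975 * trials) - 1]   # upper end of the 95 % interval
    return mean, sqrt(var), (low, high)


y_lpu, u_lpu = lpu_product(10.0, 2.0, 10.0, 1.0)
y_mcs, u_mcs, interval = mcs_product(10.0, 2.0, 10.0, 1.0)
print(y_lpu, u_lpu)
print(y_mcs, u_mcs, interval)
```

In this case the two standard uncertainties differ only slightly, because first-order LPU omits the term u1²·u2² in the variance of the product. The MCS coverage interval is also slightly asymmetric about the estimate, which a symmetric interval based on the Central Limit Theorem cannot show.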


1. Aim of the course 


Results of the European Network “MID-Software”, in which a framework
and sets of particular software requirements have been developed, were pre- 
sented. The requirements are harmonized among the participating European 
partners from Notified Bodies and industry. Although they are based on the 
European “Measuring Instruments Directive” (MID) and related to applications 
in legal metrology, there are principles and even particular requirements, which 
can be transferred to non-legal environments. The course was aimed at 
presenting the approach and explaining the particular sets of requirements 
developed. 


2. Requirements and validation 


Considering the current discussion on software validation in metrology, the
definition of requirements plays an essential role. Validation means provision of
evidence that particular requirements are fulfilled. This means that requirements
must be clear when a validation is going to be carried out. The importance of
harmonization of requirements is underlined by the fact that similar
requirements can be verified by similar methods. In this respect, the work
undertaken in legal metrology is a good example of how software validation can
be organized on the basis of task division.

a Work partially funded under EU SofTools_MetroNet Contract N° G6RT-CT-2001-05061.
b In addition to the author, the course was given by Daryoush Talebi, Ulrich Grottker
(both PTB, Germany) and Mike Fortune (NWML, UK).
c The work is funded by the European Commission under the registration
number G7RT-CT-2001-05064.

The focus of the software requirements in legal metrology is the security of
software and data, i.e. the protection of software and measurement data from
accidental or intended changes. This is an issue of consumer protection, which
is a basic objective of legal metrology. However, software security is not only
an issue of legal metrology; it concerns software in metrology in general. It
represents one class of requirements of metrology software. Further classes are

- software requirements with respect to conformity of underlying models,
- software requirements with respect to conformity of standards, etc.,
- software requirements with respect to functional correctness,
- software requirements with respect to quality of implementation,
- software requirements with respect to numerical stability,
- software requirements with respect to data acquisition,
- software requirements with respect to data transmission,
- software requirements with respect to performance characteristics,
- software requirements with respect to ergonomic criteria.


3. Objective of the European Network “MID-Software” 


The Network aims at supporting the implementation of the European 
Measuring Instrument Directive (MID) in the software area by removing uncer- 
tainty as to the interpretation of software requirements and by building mutual 
confidence in the results of software testing. The main objective is the 
harmonization of MID implementations with respect to software in all EU 
member states. 

The work is intended to lead to harmonized software requirements laid
down as guidance for manufacturers and, as the requirements are simultaneously
the “technical interface” between manufacturers and Notified Bodies, for
comparable assessments by Notified Bodies in the various European member
states.
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4. Definition of requirements 


The original objective was to define and differentiate the software require- 
ments according to different categories of measuring instruments. At an early 
stage of the work, it seemed to be more advantageous to develop the 
requirements on a rather generic basis. The basis is given by typical device- 
independent configurations of measuring instruments instead of instrument 
categories. Furthermore, device-independent risk classes are introduced to 
which measuring instruments are assigned. 

As regards the configuration, the requirements are assigned to two main
groups:

- generic software requirements of basic instruments,
- generic software requirements of extended IT functionality.

Two classes are differentiated as regards software requirements of basic
instruments:

- software of built-for-purpose instruments,
- software of instruments that use a universal computer.

To show the different conditions underlying the definition, typical
characteristics of the two classes are given.


A built-for-purpose instrument is a measuring instrument with an 
embedded software system. It is characterized by the following features: 


- The entire software of the measuring instrument is constructed for the
  measuring purpose.
- The software normally is not intended to be modified after approval.
- The user interface is dedicated to the measuring purpose.
- The software is invariable and there are no means for programming
  or loading software.
- The device may have interfaces for data transmission. The respective
  requirements have to be observed.
- The device may have the ability to store measurement data. The
  respective requirements have to be observed.


Requirements for software of instruments that use a universal computer 
must observe the following features in particular: 


- Any operating system and supporting software (e.g. drivers) may be
  used.
- In addition to the measuring instrument application, other software
  applications may at the same time also reside on the system.
- The computer system may stand alone, be part of a closed network, or
  be connected to open networks.
- Storage of measurement data may be local or remote, fixed or
  removable.
- The user interface may be switched from an operating mode
  dedicated to a measuring function to a mode that is not, and vice
  versa.
- As the system is general-purpose, the measuring sensor would
  normally be external to the computer module and be linked to it by a
  communication link.

The second main group of requirements relates to the extended IT 
functionality of measuring instruments. There are four different groups of 
requirements: 

- software requirements for the storage of measurement data,
- software requirements for the transmission of measurement data,
- software requirements for the software download,
- software requirements for the separation of software into legally
  relevant and legally non-relevant data.

The requirements for these groups apply only if the device under
consideration has the respective function.
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